Convolutional Neural Networks

Project: Write an Algorithm for Landmark Classification


In this notebook, some template code has already been provided for you, and you will need to implement additional functionality to successfully complete this project. You will not need to modify the included code beyond what is requested. Sections that begin with '(IMPLEMENTATION)' in the header indicate that the following block of code will require additional functionality which you must provide. Instructions will be provided for each section, and the specifics of the implementation are marked in the code block with a 'TODO' statement. Please be sure to read the instructions carefully!

Note: Once you have completed all the code implementations, you need to finalize your work by exporting the Jupyter Notebook as an HTML document. Before exporting the notebook to HTML, all the code cells need to have been run so that reviewers can see the final implementation and output. You can then export the notebook by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question X' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. Markdown cells can be edited by double-clicking the cell to enter edit mode.

The rubric contains optional "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. If you decide to pursue the "Stand Out Suggestions", you should include the code in this Jupyter notebook.


Why We're Here

Photo sharing and photo storage services like to have location data for each photo that is uploaded. With the location data, these services can build advanced features, such as automatic suggestion of relevant tags or automatic photo organization, which help provide a compelling user experience. Although a photo's location can often be obtained by looking at the photo's metadata, many photos uploaded to these services will not have location metadata available. This can happen when, for example, the camera capturing the picture does not have GPS or if a photo's metadata is scrubbed due to privacy concerns.

If no location metadata for an image is available, one way to infer the location is to detect and classify a discernable landmark in the image. Given the large number of landmarks across the world and the immense volume of images that are uploaded to photo sharing services, using human judgement to classify these landmarks would not be feasible.

In this notebook, you will take the first steps towards addressing this problem by building models to automatically predict the location of the image based on any landmarks depicted in the image. At the end of this project, your code will accept any user-supplied image as input and suggest the top k most relevant landmarks from 50 possible landmarks from across the world. The image below displays a potential sample output of your finished project.

Sample landmark classification output

The Road Ahead

We break the notebook into separate steps. Feel free to use the links below to navigate the notebook.

  • Step 0: Download Datasets and Install Python Modules
  • Step 1: Create a CNN to Classify Landmarks (from Scratch)
  • Step 2: Create a CNN to Classify Landmarks (using Transfer Learning)
  • Step 3: Write Your Landmark Prediction Algorithm

Step 0: Download Datasets and Install Python Modules

Note: if you are using the Udacity workspace, YOU CAN SKIP THIS STEP. The dataset can be found in the /data folder and all required Python modules have been installed in the workspace.

Download the landmark dataset. Unzip the folder and place it in this project's home directory, at the location /landmark_images.

Install the following Python modules:

  • cv2
  • matplotlib
  • numpy
  • PIL
  • torch
  • torchvision

Step 1: Create a CNN to Classify Landmarks (from Scratch)

In this step, you will create a CNN that classifies landmarks. You must create your CNN from scratch (so, you can't use transfer learning yet!), and you must attain a test accuracy of at least 20%.

Although 20% may seem low at first glance, it seems more reasonable after realizing how difficult of a problem this is. Many times, an image that is taken at a landmark captures a fairly mundane image of an animal or plant, like in the following picture.

Bird in Haleakalā National Park

Just by looking at that image alone, would you have been able to guess that it was taken at the Haleakalā National Park in Hawaii?

An accuracy of 20% is significantly better than random guessing, which would provide an accuracy of just 2%. In Step 2 of this notebook, you will have the opportunity to greatly improve accuracy by using transfer learning to create a CNN.

Remember that practice is far ahead of theory in deep learning. Experiment with many different architectures, and trust your intuition. And, of course, have fun!

(IMPLEMENTATION) Specify Data Loaders for the Landmark Dataset

Use the code cell below to create three separate data loaders: one for training data, one for validation data, and one for test data. Randomly split the images located at landmark_images/train to create the train and validation data loaders, and use the images located at landmark_images/test to create the test data loader.

Note: Remember that the dataset can be found at /data/landmark_images/ in the workspace.

All three of your data loaders should be accessible via a dictionary named loaders_scratch. Your train data loader should be at loaders_scratch['train'], your validation data loader should be at loaders_scratch['valid'], and your test data loader should be at loaders_scratch['test'].

You may find this documentation on custom datasets to be a useful resource. If you are interested in augmenting your training and/or validation data, check out the wide variety of transforms!

In [1]:
### TODO: Write data loaders for training, validation, and test sets
## Specify appropriate transforms, and batch_sizes
import os
from torchvision import datasets
import torchvision.transforms as transforms
BATCH_SIZE = 32
IMGS_MEAN = [0.485, 0.456, 0.406]
IMGS_STD = [0.229, 0.224, 0.225]
normalize = transforms.Normalize(mean=IMGS_MEAN, # Using means and stds from imagenet
                             std=IMGS_STD)


transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize])
In [2]:
train_set = datasets.ImageFolder("/data/landmark_images/train", transform=transform) 
In [3]:
from torch.utils.data.sampler import SubsetRandomSampler
import numpy as np
import torch
def train_val_split(dataset, batch_size=16, validation_split=.2, 
                    shuffle_dataset=True, random_seed=42):
    """
    Helper code adapted from: https://stackoverflow.com/questions/50544730/how-do-i-split-a-custom-dataset-into-training-and-test-datasets
    """
    # Creating data indices for training and validation splits:
    dataset_size = len(dataset)
    indices = list(range(dataset_size))
    split = int(np.floor(validation_split * dataset_size))
    if shuffle_dataset :
        np.random.seed(random_seed)
        np.random.shuffle(indices)
    train_indices, val_indices = indices[split:], indices[:split]

    # Creating PT data samplers and loaders:
    train_sampler = SubsetRandomSampler(train_indices)
    valid_sampler = SubsetRandomSampler(val_indices)

    train_loader = torch.utils.data.DataLoader(dataset, 
                                               batch_size=batch_size, 
                                               sampler=train_sampler)
    
    validation_loader = torch.utils.data.DataLoader(dataset, 
                                                    batch_size=batch_size,
                                                    sampler=valid_sampler)
    return train_loader, validation_loader

train_loader, valid_loader = train_val_split(train_set, batch_size=BATCH_SIZE)
In [4]:
test_set = datasets.ImageFolder("/data/landmark_images/test", transform=transform)    

test_loader = torch.utils.data.DataLoader(train_set,
                                          batch_size=BATCH_SIZE)

loaders_scratch = {'train': train_loader, 
                   'valid': valid_loader, 
                   'test': test_loader}
In [5]:
import glob, random
from PIL import Image
img_path = random.choice(glob.glob('/data/landmark_images/train/*/*.jpg'))
img = Image.open(img_path); img
Out[5]:
In [6]:
tr = transforms.Compose([
    transforms.Resize(224),
    #transforms.CenterCrop(224),
])
tr(img)
Out[6]:
In [7]:
tr = transforms.Compose([
    #transforms.Resize(224),
    transforms.CenterCrop(224),
])
tr(img)
Out[7]:
In [8]:
tr = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
])
tr(img)
Out[8]:
In [ ]:
 

Question 1: Describe your chosen procedure for preprocessing the data.

  • How does your code resize the images (by cropping, stretching, etc)? What size did you pick for the input tensor, and why?
  • Did you decide to augment the dataset? If so, how (through translations, flips, rotations, etc)? If not, why not?

Answer:

  • The images have been Resized to 224 (i.e. the shortest edge has been reduced to 224px instead of the image being shrinked into a square of 224x224) followed by Center Crop of 224 pixels.
  • For simplicity I decided not to augment the dataset. The reason is because I want first a baseline score without that and then later, if necessary, I can compare with more advanced techniques.

(IMPLEMENTATION) Visualize a Batch of Training Data

Use the code cell below to retrieve a batch of images from your train data loader, display at least 5 images simultaneously, and label each displayed image with its class name (e.g., "Golden Gate Bridge").

Visualizing the output of your data loader is a great way to ensure that your data loading and preprocessing are working as expected.

In [9]:
import torchvision
import matplotlib.pyplot as plt
%matplotlib inline

N_IMAGES = 8

def inverse_normalize(tensor, mean, std):
    for t, m, s in zip(tensor, mean, std):
        t.mul_(s).add_(m)
    return tensor

def imshow(img, labels, classes):
    img = inverse_normalize(img, 
                            mean=IMGS_MEAN, 
                            std=IMGS_STD)
    npimg = img.numpy()
    npimg = np.clip(npimg, 0, 1)
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.title([classes[l] for l in labels])
    plt.show()

def show_imgs_from_batch(img_loader, n_imgs, img_set):
    dataiter = iter(img_loader)
    images, labels = dataiter.next()
    images, labels = images[:n_imgs], labels[:n_imgs]
    plt.figure(figsize=(20, 20))
    imshow(torchvision.utils.make_grid(images, nrow=4), labels, img_set.classes)

show_imgs_from_batch(train_loader, N_IMAGES, train_set)
In [10]:
show_imgs_from_batch(test_loader, N_IMAGES, test_set)

Initialize use_cuda variable

In [11]:
# useful variable that tells us whether we should use the GPU
use_cuda = torch.cuda.is_available()

(IMPLEMENTATION) Specify Loss Function and Optimizer

Use the next code cell to specify a loss function and optimizer. Save the chosen loss function as criterion_scratch, and fill in the function get_optimizer_scratch below.

In [12]:
import torch.optim as optim
import torch.nn as nn

## select loss function
criterion_scratch = nn.CrossEntropyLoss()

def get_optimizer_scratch(model, lr=1e-3): # Adding learning rate for fine-tuning later
    ## select and return an optimizer
    optimizer_scratch = optim.SGD(model_scratch.parameters(), lr=lr)
    return optimizer_scratch    

(IMPLEMENTATION) Model Architecture

Create a CNN to classify images of landmarks. Use the template in the code cell below.

In [13]:
n_classes=len(train_set.classes)
n_classes
Out[13]:
50
In [14]:
import torch.nn.functional as F

# define the CNN architecture
class Net(nn.Module):
    ## TODO: choose an architecture, and complete the class
    def __init__(self):
        super(Net, self).__init__()
        
        ## Define layers of a CNN
        self.conv = nn.Conv2d(3,16,5, stride=2)
        self.maxpool = nn.MaxPool2d(2,2)
        self.fc = nn.Linear(55*55*16,50)     
    
    def forward(self, x):
        ## Define forward behavior
        x = self.maxpool(F.relu(self.conv(x)))
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

#-#-# Do NOT modify the code below this line. #-#-#

# instantiate the CNN
model_scratch = Net()

# move tensors to GPU if CUDA is available
if use_cuda:
    model_scratch.cuda()

Question 2: Outline the steps you took to get to your final CNN architecture and your reasoning at each step.

Answer:

I want to try the simplest model first in order to use as baseline. Hence, I selected one single conv layer with maxpool 2x2 followed by a Linear Layer. The kernel size is 5 as the first channel has bigger dimensions and I've selected stride=2 for saving computational power.

(IMPLEMENTATION) Implement the Training Algorithm

Implement your training algorithm in the code cell below. Save the final model parameters at the filepath stored in the variable save_path.

In [15]:
def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf 
    
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        
        ###################
        # train the model #
        ###################
        # set the module to training mode
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()

            ## TODO: find the loss and update the model parameters accordingly
            ## record the average training loss, using something like
            ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - train_loss))
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

            train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - train_loss))
            print(f'Batch id {batch_idx}, Training Loss = {train_loss}')

        ######################    
        # validate the model #
        ######################
        # set the model to evaluation mode
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()

            ## TODO: update average validation loss 
            output = model(data)
            loss = criterion(output, target)
            valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - valid_loss))

        # print training/validation statistics 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, 
            train_loss,
            valid_loss
            ))

        ## TODO: if the validation loss has decreased, save the model at the filepath stored in save_path
        if valid_loss <= valid_loss_min:
                print(f'''Validation loss decreased from {valid_loss_min:.6f} to {valid_loss:.6f}. Saving model.''')
                torch.save(model.state_dict(), save_path)
                valid_loss_min = valid_loss
        
    return model

(IMPLEMENTATION) Experiment with the Weight Initialization

Use the code cell below to define a custom weight initialization, and then train with your weight initialization for a few epochs. Make sure that neither the training loss nor validation loss is nan.

Later on, you will be able to see how this compares to training with PyTorch's default weight initialization.

In [16]:
def custom_weight_init(m):
    ##implement a weight initialization strategy
    if isinstance(m, nn.Conv2d): # xavier uniform for the conv layers
        torch.nn.init.xavier_uniform_(m.weight)
        m.bias.data.fill_(0)
        
    if isinstance(m, nn.Linear): #normal distribution when it is linear
        y = m.in_features
        m.weight.data.normal_(0.0,
                              1/np.sqrt(y)) 
        m.bias.data.fill_(0)

#-#-# Do NOT modify the code below this line. #-#-#
    
model_scratch.apply(custom_weight_init)
model_scratch = train(3, loaders_scratch, model_scratch, get_optimizer_scratch(model_scratch),
                      criterion_scratch, use_cuda, 'ignore.pt')
Batch id 0, Training Loss = 4.162134170532227
Batch id 1, Training Loss = 4.043323040008545
Batch id 2, Training Loss = 4.023732582728068
Batch id 3, Training Loss = 4.007790744304657
Batch id 4, Training Loss = 4.016383409500122
Batch id 5, Training Loss = 4.008849859237671
Batch id 6, Training Loss = 4.007618699754987
Batch id 7, Training Loss = 3.9968114495277405
Batch id 8, Training Loss = 3.9961966938442655
Batch id 9, Training Loss = 3.988043117523193
Batch id 10, Training Loss = 3.9857669310136274
Batch id 11, Training Loss = 3.974741816520691
Batch id 12, Training Loss = 3.962545339877789
Batch id 13, Training Loss = 3.9619142838886807
Batch id 14, Training Loss = 3.950221395492554
Batch id 15, Training Loss = 3.9584410339593887
Batch id 16, Training Loss = 3.952392956789802
Batch id 17, Training Loss = 3.9476120604409113
Batch id 18, Training Loss = 3.9385894599713778
Batch id 19, Training Loss = 3.9362573981285096
Batch id 20, Training Loss = 3.9361001650492353
Batch id 21, Training Loss = 3.932297923348167
Batch id 22, Training Loss = 3.9271686388098677
Batch id 23, Training Loss = 3.9218935767809553
Batch id 24, Training Loss = 3.92383547782898
Batch id 25, Training Loss = 3.9207763488476095
Batch id 26, Training Loss = 3.909616108293887
Batch id 27, Training Loss = 3.908943567957197
Batch id 28, Training Loss = 3.9056577600281814
Batch id 29, Training Loss = 3.9047630548477175
Batch id 30, Training Loss = 3.901809753910188
Batch id 31, Training Loss = 3.8996246755123143
Batch id 32, Training Loss = 3.897199377869115
Batch id 33, Training Loss = 3.895337679806878
Batch id 34, Training Loss = 3.8930347578866145
Batch id 35, Training Loss = 3.884039448367225
Batch id 36, Training Loss = 3.878109339121226
Batch id 37, Training Loss = 3.879589388245031
Batch id 38, Training Loss = 3.8788422071016755
Batch id 39, Training Loss = 3.8789337158203128
Batch id 40, Training Loss = 3.881052697577128
Batch id 41, Training Loss = 3.878302625247411
Batch id 42, Training Loss = 3.8734562230664635
Batch id 43, Training Loss = 3.8702205365354367
Batch id 44, Training Loss = 3.8665957662794326
Batch id 45, Training Loss = 3.8640175746834795
Batch id 46, Training Loss = 3.861614983132545
Batch id 47, Training Loss = 3.8614966919024787
Batch id 48, Training Loss = 3.857756877432064
Batch id 49, Training Loss = 3.85858856678009
Batch id 50, Training Loss = 3.859155210794187
Batch id 51, Training Loss = 3.857078079993908
Batch id 52, Training Loss = 3.8571482019604373
Batch id 53, Training Loss = 3.8574107752905946
Batch id 54, Training Loss = 3.859440963918512
Batch id 55, Training Loss = 3.8554843110697603
Batch id 56, Training Loss = 3.8526178995768223
Batch id 57, Training Loss = 3.850942451378394
Batch id 58, Training Loss = 3.8494837688187413
Batch id 59, Training Loss = 3.847548305988311
Batch id 60, Training Loss = 3.8462687007716436
Batch id 61, Training Loss = 3.8460754502204146
Batch id 62, Training Loss = 3.8433689125000474
Batch id 63, Training Loss = 3.842606019228696
Batch id 64, Training Loss = 3.8416995598719663
Batch id 65, Training Loss = 3.8391664642276178
Batch id 66, Training Loss = 3.8384886997849184
Batch id 67, Training Loss = 3.8342619222753176
Batch id 68, Training Loss = 3.832342172014539
Batch id 69, Training Loss = 3.830743186814443
Batch id 70, Training Loss = 3.8285459565444717
Batch id 71, Training Loss = 3.827542470561132
Batch id 72, Training Loss = 3.8279078202704846
Batch id 73, Training Loss = 3.8259273187534215
Batch id 74, Training Loss = 3.826549412409463
Batch id 75, Training Loss = 3.8249784268830935
Batch id 76, Training Loss = 3.8239159738862654
Batch id 77, Training Loss = 3.8225872913996364
Batch id 78, Training Loss = 3.8222661742681177
Batch id 79, Training Loss = 3.822113451361655
Batch id 80, Training Loss = 3.822729475704239
Batch id 81, Training Loss = 3.8190519286365032
Batch id 82, Training Loss = 3.818120080304432
Batch id 83, Training Loss = 3.8169471053850073
Batch id 84, Training Loss = 3.815105906654806
Batch id 85, Training Loss = 3.8124788999557486
Batch id 86, Training Loss = 3.811804176747113
Batch id 87, Training Loss = 3.8099400617859573
Batch id 88, Training Loss = 3.8083347952767697
Batch id 89, Training Loss = 3.806721361478169
Batch id 90, Training Loss = 3.8010635297377027
Batch id 91, Training Loss = 3.7979480390963345
Batch id 92, Training Loss = 3.796331892731369
Batch id 93, Training Loss = 3.7979076010115604
Batch id 94, Training Loss = 3.7968356609344482
Batch id 95, Training Loss = 3.7947918425003686
Batch id 96, Training Loss = 3.7939979931742873
Batch id 97, Training Loss = 3.7942631925855363
Batch id 98, Training Loss = 3.7919606656739204
Batch id 99, Training Loss = 3.7900878334045407
Batch id 100, Training Loss = 3.7903523067436593
Batch id 101, Training Loss = 3.7870455232321043
Batch id 102, Training Loss = 3.783798555725986
Batch id 103, Training Loss = 3.7841991369540873
Batch id 104, Training Loss = 3.7833846183050244
Batch id 105, Training Loss = 3.781744763536273
Batch id 106, Training Loss = 3.7787482292852665
Batch id 107, Training Loss = 3.77887350320816
Batch id 108, Training Loss = 3.7774287670030504
Batch id 109, Training Loss = 3.7768452644348143
Batch id 110, Training Loss = 3.7746383520933957
Batch id 111, Training Loss = 3.773898180041994
Batch id 112, Training Loss = 3.7723606772127405
Batch id 113, Training Loss = 3.7718807019685445
Batch id 114, Training Loss = 3.7706442169521166
Batch id 115, Training Loss = 3.7684801097573906
Batch id 116, Training Loss = 3.768112818400065
Batch id 117, Training Loss = 3.767400319293394
Batch id 118, Training Loss = 3.76541486708056
Batch id 119, Training Loss = 3.7619724055131276
Batch id 120, Training Loss = 3.761319907243587
Batch id 121, Training Loss = 3.761028838939354
Batch id 122, Training Loss = 3.761486824935045
Batch id 123, Training Loss = 3.7607421490453907
Batch id 124, Training Loss = 3.761057134628296
Epoch: 1 	Training Loss: 3.761057 	Validation Loss: 3.596304
Validation loss decreased from inf to 3.596304. Saving model.
Batch id 0, Training Loss = 3.581758975982666
Batch id 1, Training Loss = 3.549558997154236
Batch id 2, Training Loss = 3.4862398306528726
Batch id 3, Training Loss = 3.511155903339386
Batch id 4, Training Loss = 3.5669416427612304
Batch id 5, Training Loss = 3.5766831636428833
Batch id 6, Training Loss = 3.5594266823359897
Batch id 7, Training Loss = 3.5543254017829895
Batch id 8, Training Loss = 3.542958392037286
Batch id 9, Training Loss = 3.5515376329421997
Batch id 10, Training Loss = 3.5401178056543525
Batch id 11, Training Loss = 3.518931031227112
Batch id 12, Training Loss = 3.5300510846651516
Batch id 13, Training Loss = 3.517671653202602
Batch id 14, Training Loss = 3.514204184214274
Batch id 15, Training Loss = 3.511504203081131
Batch id 16, Training Loss = 3.5162420132580925
Batch id 17, Training Loss = 3.517940322558085
Batch id 18, Training Loss = 3.502771515595285
Batch id 19, Training Loss = 3.5013004422187803
Batch id 20, Training Loss = 3.5079716387249174
Batch id 21, Training Loss = 3.5155990448865024
Batch id 22, Training Loss = 3.51329547425975
Batch id 23, Training Loss = 3.5090874830881758
Batch id 24, Training Loss = 3.508743343353272
Batch id 25, Training Loss = 3.504737312977131
Batch id 26, Training Loss = 3.5051758819156227
Batch id 27, Training Loss = 3.4994253345898225
Batch id 28, Training Loss = 3.507114435064382
Batch id 29, Training Loss = 3.507393042246501
Batch id 30, Training Loss = 3.5017151678762133
Batch id 31, Training Loss = 3.502059414982796
Batch id 32, Training Loss = 3.4981183788993144
Batch id 33, Training Loss = 3.4946813443127804
Batch id 34, Training Loss = 3.4896393980298726
Batch id 35, Training Loss = 3.4884228044086036
Batch id 36, Training Loss = 3.4878204062178333
Batch id 37, Training Loss = 3.4821870828929704
Batch id 38, Training Loss = 3.475863927449936
Batch id 39, Training Loss = 3.4667470872402193
Batch id 40, Training Loss = 3.468394192253671
Batch id 41, Training Loss = 3.466608762741089
Batch id 42, Training Loss = 3.466004865114079
Batch id 43, Training Loss = 3.466899351640181
Batch id 44, Training Loss = 3.461647547615899
Batch id 45, Training Loss = 3.463905350021694
Batch id 46, Training Loss = 3.4689242180357587
Batch id 47, Training Loss = 3.4665303130944567
Batch id 48, Training Loss = 3.470266955239432
Batch id 49, Training Loss = 3.474359254837036
Batch id 50, Training Loss = 3.469580383861766
Batch id 51, Training Loss = 3.4709273026539726
Batch id 52, Training Loss = 3.474463548300401
Batch id 53, Training Loss = 3.474182614573726
Batch id 54, Training Loss = 3.4783473058180374
Batch id 55, Training Loss = 3.474702409335545
Batch id 56, Training Loss = 3.4746049370682033
Batch id 57, Training Loss = 3.474456466477493
Batch id 58, Training Loss = 3.474866806450537
Batch id 59, Training Loss = 3.475925250848135
Batch id 60, Training Loss = 3.4769867717242637
Batch id 61, Training Loss = 3.47761957876144
Batch id 62, Training Loss = 3.4770725189693397
Batch id 63, Training Loss = 3.47562300041318
Batch id 64, Training Loss = 3.476606662456806
Batch id 65, Training Loss = 3.475029190381368
Batch id 66, Training Loss = 3.4760464020629427
Batch id 67, Training Loss = 3.4774358097244713
Batch id 68, Training Loss = 3.4765357902084575
Batch id 69, Training Loss = 3.477314308711461
Batch id 70, Training Loss = 3.4756423990491414
Batch id 71, Training Loss = 3.477296676900652
Batch id 72, Training Loss = 3.4784989683595424
Batch id 73, Training Loss = 3.477227326985952
Batch id 74, Training Loss = 3.474428488413493
Batch id 75, Training Loss = 3.4749957605412134
Batch id 76, Training Loss = 3.4721109774205594
Batch id 77, Training Loss = 3.4709086326452403
Batch id 78, Training Loss = 3.4672906972184965
Batch id 79, Training Loss = 3.4632119834423065
Batch id 80, Training Loss = 3.4653803919568476
Batch id 81, Training Loss = 3.466119324288717
Batch id 82, Training Loss = 3.4657882897250625
Batch id 83, Training Loss = 3.4653123219807944
Batch id 84, Training Loss = 3.466009874904857
Batch id 85, Training Loss = 3.4677773824957914
Batch id 86, Training Loss = 3.4673962044989923
Batch id 87, Training Loss = 3.4661778238686645
Batch id 88, Training Loss = 3.4667386419317694
Batch id 89, Training Loss = 3.464423171679179
Batch id 90, Training Loss = 3.4627962295825667
Batch id 91, Training Loss = 3.4629127253656806
Batch id 92, Training Loss = 3.4622044230020177
Batch id 93, Training Loss = 3.461021149412115
Batch id 94, Training Loss = 3.461276159788433
Batch id 95, Training Loss = 3.4606053282817206
Batch id 96, Training Loss = 3.458231323773099
Batch id 97, Training Loss = 3.458006863691369
Batch id 98, Training Loss = 3.4550817422192504
Batch id 99, Training Loss = 3.4577500534057615
Batch id 100, Training Loss = 3.4580845667584104
Batch id 101, Training Loss = 3.4575049760294894
Batch id 102, Training Loss = 3.4590295425896502
Batch id 103, Training Loss = 3.4599372790409966
Batch id 104, Training Loss = 3.4604703948611304
Batch id 105, Training Loss = 3.4625973386584588
Batch id 106, Training Loss = 3.460508188354635
Batch id 107, Training Loss = 3.4609830776850385
Batch id 108, Training Loss = 3.4585396171709815
Batch id 109, Training Loss = 3.459327158060941
Batch id 110, Training Loss = 3.4583553795341975
Batch id 111, Training Loss = 3.4591810532978604
Batch id 112, Training Loss = 3.459081027360089
Batch id 113, Training Loss = 3.455108259853564
Batch id 114, Training Loss = 3.454240944074548
Batch id 115, Training Loss = 3.4519502541114546
Batch id 116, Training Loss = 3.4525669212015266
Batch id 117, Training Loss = 3.4543024645013323
Batch id 118, Training Loss = 3.455721362298276
Batch id 119, Training Loss = 3.4545971353848772
Batch id 120, Training Loss = 3.454140649354162
Batch id 121, Training Loss = 3.4547490135568086
Batch id 122, Training Loss = 3.456341191035945
Batch id 123, Training Loss = 3.4571400265539842
Batch id 124, Training Loss = 3.4569415607452387
Epoch: 2 	Training Loss: 3.456942 	Validation Loss: 3.412524
Validation loss decreased from 3.596304 to 3.412524. Saving model.
Batch id 0, Training Loss = 3.4204134941101074
Batch id 1, Training Loss = 3.2631232738494873
Batch id 2, Training Loss = 3.2617643674214682
Batch id 3, Training Loss = 3.213085353374481
Batch id 4, Training Loss = 3.2129651069641114
Batch id 5, Training Loss = 3.2280699014663696
Batch id 6, Training Loss = 3.2532991000584195
Batch id 7, Training Loss = 3.2748819291591644
Batch id 8, Training Loss = 3.274193207422892
Batch id 9, Training Loss = 3.2694129228591917
Batch id 10, Training Loss = 3.2732884233648125
Batch id 11, Training Loss = 3.262365122636159
Batch id 12, Training Loss = 3.2489974682147684
Batch id 13, Training Loss = 3.244844147137233
Batch id 14, Training Loss = 3.256715981165568
Batch id 15, Training Loss = 3.2721149027347565
Batch id 16, Training Loss = 3.2835731085608986
Batch id 17, Training Loss = 3.2812850872675576
Batch id 18, Training Loss = 3.2836663346541552
Batch id 19, Training Loss = 3.2732705831527706
Batch id 20, Training Loss = 3.2834498882293697
Batch id 21, Training Loss = 3.2883026058023623
Batch id 22, Training Loss = 3.281722949898761
Batch id 23, Training Loss = 3.2888168990612026
Batch id 24, Training Loss = 3.286605720520019
Batch id 25, Training Loss = 3.2884726065855756
Batch id 26, Training Loss = 3.290300987384937
Batch id 27, Training Loss = 3.300907492637634
Batch id 28, Training Loss = 3.303059602605885
Batch id 29, Training Loss = 3.3083546717961623
Batch id 30, Training Loss = 3.2990006862148156
Batch id 31, Training Loss = 3.2978109866380687
Batch id 32, Training Loss = 3.2896949666919126
Batch id 33, Training Loss = 3.2836384913500614
Batch id 34, Training Loss = 3.284365013667515
Batch id 35, Training Loss = 3.289857745170593
Batch id 36, Training Loss = 3.2833233330700846
Batch id 37, Training Loss = 3.2820719355031063
Batch id 38, Training Loss = 3.2745429369119496
Batch id 39, Training Loss = 3.280179226398468
Batch id 40, Training Loss = 3.28217135406122
Batch id 41, Training Loss = 3.2868120670318604
Batch id 42, Training Loss = 3.285218593686126
Batch id 43, Training Loss = 3.2885238961739973
Batch id 44, Training Loss = 3.2935721821255153
Batch id 45, Training Loss = 3.2856068300164263
Batch id 46, Training Loss = 3.286497359580182
Batch id 47, Training Loss = 3.2899083892504373
Batch id 48, Training Loss = 3.281556007813434
Batch id 49, Training Loss = 3.2848284244537354
Batch id 50, Training Loss = 3.282256145103305
Batch id 51, Training Loss = 3.2795198743159957
Batch id 52, Training Loss = 3.2794710780089757
Batch id 53, Training Loss = 3.2792532488151833
Batch id 54, Training Loss = 3.2793575460260564
Batch id 55, Training Loss = 3.280384885413306
Batch id 56, Training Loss = 3.281519208038062
Batch id 57, Training Loss = 3.284392328097902
Batch id 58, Training Loss = 3.2809144399933894
Batch id 59, Training Loss = 3.2793611129124955
Batch id 60, Training Loss = 3.2764713256085503
Batch id 61, Training Loss = 3.2761881197652505
Batch id 62, Training Loss = 3.2689933739011248
Batch id 63, Training Loss = 3.2689649239182472
Batch id 64, Training Loss = 3.2635915829585147
Batch id 65, Training Loss = 3.2622578360817647
Batch id 66, Training Loss = 3.259475241846113
Batch id 67, Training Loss = 3.2552738820805267
Batch id 68, Training Loss = 3.260405989660733
Batch id 69, Training Loss = 3.2630860328674314
Batch id 70, Training Loss = 3.258278020670716
Batch id 71, Training Loss = 3.255449871222178
Batch id 72, Training Loss = 3.254381571730522
Batch id 73, Training Loss = 3.257664854462082
Batch id 74, Training Loss = 3.25713321685791
Batch id 75, Training Loss = 3.261391448347192
Batch id 76, Training Loss = 3.2611941641027276
Batch id 77, Training Loss = 3.2632224223552604
Batch id 78, Training Loss = 3.260030773621571
Batch id 79, Training Loss = 3.261283895373344
Batch id 80, Training Loss = 3.2593089886653566
Batch id 81, Training Loss = 3.260046127365856
Batch id 82, Training Loss = 3.260933798479746
Batch id 83, Training Loss = 3.2607378760973607
Batch id 84, Training Loss = 3.2606813458835373
Batch id 85, Training Loss = 3.2585154633189353
Batch id 86, Training Loss = 3.2575512316035122
Batch id 87, Training Loss = 3.2581292146986174
Batch id 88, Training Loss = 3.259763996252852
Batch id 89, Training Loss = 3.2566473695966924
Batch id 90, Training Loss = 3.258273813750717
Batch id 91, Training Loss = 3.2581543637358616
Batch id 92, Training Loss = 3.25469619997086
Batch id 93, Training Loss = 3.2550329781593152
Batch id 94, Training Loss = 3.2501319157449813
Batch id 95, Training Loss = 3.2511062026023856
Batch id 96, Training Loss = 3.2510526573535086
Batch id 97, Training Loss = 3.2511530603681282
Batch id 98, Training Loss = 3.248169530521739
Batch id 99, Training Loss = 3.246863059997558
Batch id 100, Training Loss = 3.245636078390744
Batch id 101, Training Loss = 3.2464376898372866
Batch id 102, Training Loss = 3.2466522466789165
Batch id 103, Training Loss = 3.245531455828593
Batch id 104, Training Loss = 3.2468999158768423
Batch id 105, Training Loss = 3.2464551768212946
Batch id 106, Training Loss = 3.243784527912318
Batch id 107, Training Loss = 3.24300045437283
Batch id 108, Training Loss = 3.243329564365772
Batch id 109, Training Loss = 3.2460668823935768
Batch id 110, Training Loss = 3.24632183281151
Batch id 111, Training Loss = 3.2460499299424033
Batch id 112, Training Loss = 3.245028407172819
Batch id 113, Training Loss = 3.2476291008162916
Batch id 114, Training Loss = 3.248304369138635
Batch id 115, Training Loss = 3.2481361401492155
Batch id 116, Training Loss = 3.244751901708098
Batch id 117, Training Loss = 3.243228752734298
Batch id 118, Training Loss = 3.2432833739689424
Batch id 119, Training Loss = 3.2453674614429477
Batch id 120, Training Loss = 3.244825853789149
Batch id 121, Training Loss = 3.2452765625031272
Batch id 122, Training Loss = 3.2458429065177117
Batch id 123, Training Loss = 3.2458191802424774
Batch id 124, Training Loss = 3.2463645286560063
Epoch: 3 	Training Loss: 3.246365 	Validation Loss: 3.347000
Validation loss decreased from 3.412524 to 3.347000. Saving model.

(IMPLEMENTATION) Train and Validate the Model

Run the next code cell to train your model.

In [17]:
## TODO: you may change the number of epochs if you'd like,
## but changing it is not required
num_epochs = 3

#-#-# Do NOT modify the code below this line. #-#-#

# function to re-initialize a model with pytorch's default weight initialization
def default_weight_init(m):
    reset_parameters = getattr(m, 'reset_parameters', None)
    if callable(reset_parameters):
        m.reset_parameters()

# reset the model parameters
model_scratch.apply(default_weight_init)

# train the model
model_scratch = train(num_epochs, loaders_scratch, model_scratch, get_optimizer_scratch(model_scratch), 
                      criterion_scratch, use_cuda, 'model_scratch.pt')
Batch id 0, Training Loss = 3.963850975036621
Batch id 1, Training Loss = 3.99538516998291
Batch id 2, Training Loss = 4.014782587687175
Batch id 3, Training Loss = 3.986504912376404
Batch id 4, Training Loss = 3.9946597099304197
Batch id 5, Training Loss = 3.9774711529413858
Batch id 6, Training Loss = 3.9814253193991522
Batch id 7, Training Loss = 3.988382488489151
Batch id 8, Training Loss = 3.987677971522013
Batch id 9, Training Loss = 3.9801257848739624
Batch id 10, Training Loss = 3.9810051267797295
Batch id 11, Training Loss = 3.978820701440175
Batch id 12, Training Loss = 3.9684318762559156
Batch id 13, Training Loss = 3.969104766845703
Batch id 14, Training Loss = 3.966271734237671
Batch id 15, Training Loss = 3.9587236493825912
Batch id 16, Training Loss = 3.9556358702042522
Batch id 17, Training Loss = 3.9604821337593927
Batch id 18, Training Loss = 3.952969588731465
Batch id 19, Training Loss = 3.950272405147553
Batch id 20, Training Loss = 3.9514351912907193
Batch id 21, Training Loss = 3.9480645439841533
Batch id 22, Training Loss = 3.932550171147222
Batch id 23, Training Loss = 3.936803251504898
Batch id 24, Training Loss = 3.935777311325073
Batch id 25, Training Loss = 3.9323554589198184
Batch id 26, Training Loss = 3.927763091193305
Batch id 27, Training Loss = 3.919528365135193
Batch id 28, Training Loss = 3.919254722266362
Batch id 29, Training Loss = 3.9150044282277427
Batch id 30, Training Loss = 3.914332059121901
Batch id 31, Training Loss = 3.9070150926709175
Batch id 32, Training Loss = 3.902297691865401
Batch id 33, Training Loss = 3.9033472537994385
Batch id 34, Training Loss = 3.8982333387647357
Batch id 35, Training Loss = 3.8922731743918524
Batch id 36, Training Loss = 3.8872943117811873
Batch id 37, Training Loss = 3.8851870110160425
Batch id 38, Training Loss = 3.8839089381389127
Batch id 39, Training Loss = 3.8813170790672302
Batch id 40, Training Loss = 3.8776816914721235
Batch id 41, Training Loss = 3.876829686618987
Batch id 42, Training Loss = 3.876230921856193
Batch id 43, Training Loss = 3.8673628622835334
Batch id 44, Training Loss = 3.8644317150115968
Batch id 45, Training Loss = 3.860453486442566
Batch id 46, Training Loss = 3.8571851608601024
Batch id 47, Training Loss = 3.85336413482825
Batch id 48, Training Loss = 3.850683927536011
Batch id 49, Training Loss = 3.851387372016907
Batch id 50, Training Loss = 3.8445616048925064
Batch id 51, Training Loss = 3.8436093788880568
Batch id 52, Training Loss = 3.844110659833224
Batch id 53, Training Loss = 3.840766805189627
Batch id 54, Training Loss = 3.8370888666673135
Batch id 55, Training Loss = 3.8374543743474137
Batch id 56, Training Loss = 3.8336930776897225
Batch id 57, Training Loss = 3.8323572956282512
Batch id 58, Training Loss = 3.8273005687584307
Batch id 59, Training Loss = 3.8247795263926183
Batch id 60, Training Loss = 3.824469038697539
Batch id 61, Training Loss = 3.8226265522741496
Batch id 62, Training Loss = 3.818202128486027
Batch id 63, Training Loss = 3.814236640930175
Batch id 64, Training Loss = 3.81581112054678
Batch id 65, Training Loss = 3.81767113642259
Batch id 66, Training Loss = 3.817126288342831
Batch id 67, Training Loss = 3.814656513578751
Batch id 68, Training Loss = 3.8104238233704493
Batch id 69, Training Loss = 3.806965473720005
Batch id 70, Training Loss = 3.8030238319450693
Batch id 71, Training Loss = 3.8038944005966178
Batch id 72, Training Loss = 3.803406401856304
Batch id 73, Training Loss = 3.8031691055040096
Batch id 74, Training Loss = 3.8041467698415112
Batch id 75, Training Loss = 3.8016599008911527
Batch id 76, Training Loss = 3.8002687801014288
Batch id 77, Training Loss = 3.800026811086214
Batch id 78, Training Loss = 3.8002498180051387
Batch id 79, Training Loss = 3.799966642260551
Batch id 80, Training Loss = 3.8004150743837704
Batch id 81, Training Loss = 3.800108985203068
Batch id 82, Training Loss = 3.8004255266074667
Batch id 83, Training Loss = 3.7970894262904205
Batch id 84, Training Loss = 3.797385768329395
Batch id 85, Training Loss = 3.7958149882250045
Batch id 86, Training Loss = 3.794904821220485
Batch id 87, Training Loss = 3.7945078069513487
Batch id 88, Training Loss = 3.7912023388937612
Batch id 89, Training Loss = 3.789419635136922
Batch id 90, Training Loss = 3.7896066393171033
Batch id 91, Training Loss = 3.786367032838904
Batch id 92, Training Loss = 3.783945975765105
Batch id 93, Training Loss = 3.780860122213972
Batch id 94, Training Loss = 3.7778710942519336
Batch id 95, Training Loss = 3.7758467892805734
Batch id 96, Training Loss = 3.7743413325437563
Batch id 97, Training Loss = 3.7742130732049746
Batch id 98, Training Loss = 3.7738426550470217
Batch id 99, Training Loss = 3.7736705327033997
Batch id 100, Training Loss = 3.768809191071161
Batch id 101, Training Loss = 3.768702488319547
Batch id 102, Training Loss = 3.767186278278388
Batch id 103, Training Loss = 3.7652927935123444
Batch id 104, Training Loss = 3.76492456254505
Batch id 105, Training Loss = 3.7637963227505953
Batch id 106, Training Loss = 3.7612784421332526
Batch id 107, Training Loss = 3.759952185330567
Batch id 108, Training Loss = 3.758547666969649
Batch id 109, Training Loss = 3.75834762616591
Batch id 110, Training Loss = 3.75908646712432
Batch id 111, Training Loss = 3.757428671632494
Batch id 112, Training Loss = 3.7556931508325895
Batch id 113, Training Loss = 3.7554045392755873
Batch id 114, Training Loss = 3.7536813570105507
Batch id 115, Training Loss = 3.7520693602233095
Batch id 116, Training Loss = 3.75220064424042
Batch id 117, Training Loss = 3.7508841348906694
Batch id 118, Training Loss = 3.7475965343603566
Batch id 119, Training Loss = 3.7455149471759794
Batch id 120, Training Loss = 3.7439223360424196
Batch id 121, Training Loss = 3.740594908839366
Batch id 122, Training Loss = 3.7391762365170607
Batch id 123, Training Loss = 3.7376835153948873
Batch id 124, Training Loss = 3.735939769744873
Epoch: 1 	Training Loss: 3.735940 	Validation Loss: 3.538663
Validation loss decreased from inf to 3.538663. Saving model.
Batch id 0, Training Loss = 3.2961065769195557
Batch id 1, Training Loss = 3.4641021490097046
Batch id 2, Training Loss = 3.4567206700642905
Batch id 3, Training Loss = 3.444944202899933
Batch id 4, Training Loss = 3.452593278884888
Batch id 5, Training Loss = 3.454494913419088
Batch id 6, Training Loss = 3.4624114717756
Batch id 7, Training Loss = 3.480534315109253
Batch id 8, Training Loss = 3.450905587938097
Batch id 9, Training Loss = 3.4546823740005497
Batch id 10, Training Loss = 3.4314969236200508
Batch id 11, Training Loss = 3.4116994937260947
Batch id 12, Training Loss = 3.4334719547858605
Batch id 13, Training Loss = 3.441285797527858
Batch id 14, Training Loss = 3.4517593065897625
Batch id 15, Training Loss = 3.4417156875133514
Batch id 16, Training Loss = 3.44711654326495
Batch id 17, Training Loss = 3.4372560315661964
Batch id 18, Training Loss = 3.4315120169990947
Batch id 19, Training Loss = 3.4187442064285283
Batch id 20, Training Loss = 3.4145411763872424
Batch id 21, Training Loss = 3.407958420840177
Batch id 22, Training Loss = 3.4066749966662866
Batch id 23, Training Loss = 3.414306273063024
Batch id 24, Training Loss = 3.419835729598999
Batch id 25, Training Loss = 3.4170833184168887
Batch id 26, Training Loss = 3.414468862392284
Batch id 27, Training Loss = 3.41985422372818
Batch id 28, Training Loss = 3.4234200428272117
Batch id 29, Training Loss = 3.426721692085266
Batch id 30, Training Loss = 3.4182028924265215
Batch id 31, Training Loss = 3.416639469563961
Batch id 32, Training Loss = 3.4143762732997085
Batch id 33, Training Loss = 3.413419379907496
Batch id 34, Training Loss = 3.4116886888231552
Batch id 35, Training Loss = 3.4117539458804664
Batch id 36, Training Loss = 3.4153430590758456
Batch id 37, Training Loss = 3.416692043605604
Batch id 38, Training Loss = 3.4115083767817573
Batch id 39, Training Loss = 3.4138266742229466
Batch id 40, Training Loss = 3.4157127752536685
Batch id 41, Training Loss = 3.4133596874418717
Batch id 42, Training Loss = 3.4118416974710866
Batch id 43, Training Loss = 3.4118728854439477
Batch id 44, Training Loss = 3.4078918668958877
Batch id 45, Training Loss = 3.4083217589751533
Batch id 46, Training Loss = 3.407891811208522
Batch id 47, Training Loss = 3.4058299710353213
Batch id 48, Training Loss = 3.406236984291855
Batch id 49, Training Loss = 3.4056370496749877
Batch id 50, Training Loss = 3.4030895279903035
Batch id 51, Training Loss = 3.4034040432709913
Batch id 52, Training Loss = 3.403357699232281
Batch id 53, Training Loss = 3.4057624251754195
Batch id 54, Training Loss = 3.405377821488814
Batch id 55, Training Loss = 3.400157655988421
Batch id 56, Training Loss = 3.404886986079969
Batch id 57, Training Loss = 3.404607604289877
Batch id 58, Training Loss = 3.401181766542338
Batch id 59, Training Loss = 3.401224939028422
Batch id 60, Training Loss = 3.4016156626529384
Batch id 61, Training Loss = 3.4008092226520663
Batch id 62, Training Loss = 3.399559024780516
Batch id 63, Training Loss = 3.401916518807411
Batch id 64, Training Loss = 3.405682112620427
Batch id 65, Training Loss = 3.40795801263867
Batch id 66, Training Loss = 3.4107371088284166
Batch id 67, Training Loss = 3.4119152461781224
Batch id 68, Training Loss = 3.411369600157807
Batch id 69, Training Loss = 3.411474112101964
Batch id 70, Training Loss = 3.412488897081832
Batch id 71, Training Loss = 3.412438597944048
Batch id 72, Training Loss = 3.4091541048598617
Batch id 73, Training Loss = 3.4061358361630827
Batch id 74, Training Loss = 3.4069557634989422
Batch id 75, Training Loss = 3.404357693697277
Batch id 76, Training Loss = 3.4015815970185517
Batch id 77, Training Loss = 3.4014246708307514
Batch id 78, Training Loss = 3.4027266894714745
Batch id 79, Training Loss = 3.404263186454773
Batch id 80, Training Loss = 3.4003327246065496
Batch id 81, Training Loss = 3.4010680913925175
Batch id 82, Training Loss = 3.400918739387788
Batch id 83, Training Loss = 3.3957717730885464
Batch id 84, Training Loss = 3.393273269428927
Batch id 85, Training Loss = 3.3906037807464604
Batch id 86, Training Loss = 3.3893409158991674
Batch id 87, Training Loss = 3.392316899516366
Batch id 88, Training Loss = 3.3904424142301752
Batch id 89, Training Loss = 3.386527779367235
Batch id 90, Training Loss = 3.3864073831956465
Batch id 91, Training Loss = 3.387205201646556
Batch id 92, Training Loss = 3.388713098341419
Batch id 93, Training Loss = 3.3881504484947693
Batch id 94, Training Loss = 3.3859436762960335
Batch id 95, Training Loss = 3.386311024427414
Batch id 96, Training Loss = 3.387709949434418
Batch id 97, Training Loss = 3.384804100406413
Batch id 98, Training Loss = 3.386227407840767
Batch id 99, Training Loss = 3.3878801369667046
Batch id 100, Training Loss = 3.387194182613108
Batch id 101, Training Loss = 3.3871059090483415
Batch id 102, Training Loss = 3.3886827820713075
Batch id 103, Training Loss = 3.387214431395897
Batch id 104, Training Loss = 3.3864243916102814
Batch id 105, Training Loss = 3.3879212118544664
Batch id 106, Training Loss = 3.3882219078384823
Batch id 107, Training Loss = 3.3886649233323554
Batch id 108, Training Loss = 3.3865585327148433
Batch id 109, Training Loss = 3.386825396797873
Batch id 110, Training Loss = 3.390295028686523
Batch id 111, Training Loss = 3.3884841778448647
Batch id 112, Training Loss = 3.3869801622576414
Batch id 113, Training Loss = 3.3889674220168797
Batch id 114, Training Loss = 3.387562150540559
Batch id 115, Training Loss = 3.3860603838131347
Batch id 116, Training Loss = 3.387872649054242
Batch id 117, Training Loss = 3.386444647433394
Batch id 118, Training Loss = 3.386847704398532
Batch id 119, Training Loss = 3.3867752850055695
Batch id 120, Training Loss = 3.3853284288043817
Batch id 121, Training Loss = 3.383765851865049
Batch id 122, Training Loss = 3.3827496582899634
Batch id 123, Training Loss = 3.380825804125878
Batch id 124, Training Loss = 3.3824371547698973
Epoch: 2 	Training Loss: 3.382437 	Validation Loss: 3.413858
Validation loss decreased from 3.538663 to 3.413858. Saving model.
Batch id 0, Training Loss = 3.3736801147460938
Batch id 1, Training Loss = 3.2891767024993896
Batch id 2, Training Loss = 3.1914073626200357
Batch id 3, Training Loss = 3.1881152987480164
Batch id 4, Training Loss = 3.17077260017395
Batch id 5, Training Loss = 3.133109211921692
Batch id 6, Training Loss = 3.1816835744040355
Batch id 7, Training Loss = 3.138791561126709
Batch id 8, Training Loss = 3.126650439368354
Batch id 9, Training Loss = 3.1395164489746095
Batch id 10, Training Loss = 3.12370870330117
Batch id 11, Training Loss = 3.1241434812545776
Batch id 12, Training Loss = 3.140414494734544
Batch id 13, Training Loss = 3.1618101256234303
Batch id 14, Training Loss = 3.1642396767934162
Batch id 15, Training Loss = 3.1631677597761154
Batch id 16, Training Loss = 3.1592229394351734
Batch id 17, Training Loss = 3.1611402564578586
Batch id 18, Training Loss = 3.1459200382232666
Batch id 19, Training Loss = 3.1533092498779296
Batch id 20, Training Loss = 3.1495005062648227
Batch id 21, Training Loss = 3.1548385728489268
Batch id 22, Training Loss = 3.153688990551492
Batch id 23, Training Loss = 3.1629869540532427
Batch id 24, Training Loss = 3.1691565895080562
Batch id 25, Training Loss = 3.1587700201914855
Batch id 26, Training Loss = 3.158772424415305
Batch id 27, Training Loss = 3.1510819026402057
Batch id 28, Training Loss = 3.158435788647881
Batch id 29, Training Loss = 3.1545789639155064
Batch id 30, Training Loss = 3.1500065788145983
Batch id 31, Training Loss = 3.153059124946594
Batch id 32, Training Loss = 3.147737864292029
Batch id 33, Training Loss = 3.1487369186737957
Batch id 34, Training Loss = 3.1487119129725865
Batch id 35, Training Loss = 3.1540210511949325
Batch id 36, Training Loss = 3.1638414795334273
Batch id 37, Training Loss = 3.168957936136346
Batch id 38, Training Loss = 3.167848281371288
Batch id 39, Training Loss = 3.173450398445129
Batch id 40, Training Loss = 3.1798156063731122
Batch id 41, Training Loss = 3.1797262543723694
Batch id 42, Training Loss = 3.1747164449026415
Batch id 43, Training Loss = 3.1655327352610496
Batch id 44, Training Loss = 3.161878887812296
Batch id 45, Training Loss = 3.1627571116323048
Batch id 46, Training Loss = 3.1687744424698194
Batch id 47, Training Loss = 3.1723391065994893
Batch id 48, Training Loss = 3.171739257111841
Batch id 49, Training Loss = 3.170385904312133
Batch id 50, Training Loss = 3.1713387685663554
Batch id 51, Training Loss = 3.1731552114853487
Batch id 52, Training Loss = 3.1775941039031403
Batch id 53, Training Loss = 3.169348729981316
Batch id 54, Training Loss = 3.164660345424305
Batch id 55, Training Loss = 3.163964646203177
Batch id 56, Training Loss = 3.15625458432917
Batch id 57, Training Loss = 3.161208313086937
Batch id 58, Training Loss = 3.1638906082864535
Batch id 59, Training Loss = 3.168023363749186
Batch id 60, Training Loss = 3.1605614404209326
Batch id 61, Training Loss = 3.1575739152969855
Batch id 62, Training Loss = 3.156782456806728
Batch id 63, Training Loss = 3.161245826631785
Batch id 64, Training Loss = 3.161160388359657
Batch id 65, Training Loss = 3.160574316978455
Batch id 66, Training Loss = 3.1607127616654584
Batch id 67, Training Loss = 3.1581260842435506
Batch id 68, Training Loss = 3.1506786968397065
Batch id 69, Training Loss = 3.1502068826130465
Batch id 70, Training Loss = 3.1521074704720946
Batch id 71, Training Loss = 3.152371373441485
Batch id 72, Training Loss = 3.1502273278693638
Batch id 73, Training Loss = 3.153324314065882
Batch id 74, Training Loss = 3.1550249862670903
Batch id 75, Training Loss = 3.1592527502461487
Batch id 76, Training Loss = 3.156671406386735
Batch id 77, Training Loss = 3.152405527921824
Batch id 78, Training Loss = 3.149650775933568
Batch id 79, Training Loss = 3.1484271109104163
Batch id 80, Training Loss = 3.152807959803829
Batch id 81, Training Loss = 3.1550803824168883
Batch id 82, Training Loss = 3.1539459285965887
Batch id 83, Training Loss = 3.1561538889294583
Batch id 84, Training Loss = 3.155398304322187
Batch id 85, Training Loss = 3.1535302511481356
Batch id 86, Training Loss = 3.15552114070147
Batch id 87, Training Loss = 3.1603692011399707
Batch id 88, Training Loss = 3.1589498841360717
Batch id 89, Training Loss = 3.158392439948188
Batch id 90, Training Loss = 3.1587039114354734
Batch id 91, Training Loss = 3.160377665706303
Batch id 92, Training Loss = 3.159716598449215
Batch id 93, Training Loss = 3.155568244609427
Batch id 94, Training Loss = 3.154377121674387
Batch id 95, Training Loss = 3.1535532300670943
Batch id 96, Training Loss = 3.15524354915029
Batch id 97, Training Loss = 3.154251465992052
Batch id 98, Training Loss = 3.1569269016535597
Batch id 99, Training Loss = 3.153052725791931
Batch id 100, Training Loss = 3.1533692166356757
Batch id 101, Training Loss = 3.1521123787936043
Batch id 102, Training Loss = 3.1512757967976692
Batch id 103, Training Loss = 3.1520871267868924
Batch id 104, Training Loss = 3.155138735544114
Batch id 105, Training Loss = 3.1542927499087354
Batch id 106, Training Loss = 3.1560468495449174
Batch id 107, Training Loss = 3.156096456227479
Batch id 108, Training Loss = 3.159380258770164
Batch id 109, Training Loss = 3.1604502894661644
Batch id 110, Training Loss = 3.156421253273079
Batch id 111, Training Loss = 3.1556864934308186
Batch id 112, Training Loss = 3.1552933697151926
Batch id 113, Training Loss = 3.1555264351660743
Batch id 114, Training Loss = 3.153841906008513
Batch id 115, Training Loss = 3.1558480838249467
Batch id 116, Training Loss = 3.156808516918084
Batch id 117, Training Loss = 3.152624209048384
Batch id 118, Training Loss = 3.1539094908898613
Batch id 119, Training Loss = 3.154531904061635
Batch id 120, Training Loss = 3.152825369322595
Batch id 121, Training Loss = 3.1501726908761944
Batch id 122, Training Loss = 3.153743802047357
Batch id 123, Training Loss = 3.1534842848777767
Batch id 124, Training Loss = 3.153459203720092
Epoch: 3 	Training Loss: 3.153459 	Validation Loss: 3.301605
Validation loss decreased from 3.413858 to 3.301605. Saving model.

(IMPLEMENTATION) Test the Model

Run the code cell below to try out your model on the test dataset of landmark images. Run the code cell below to calculate and print the test loss and accuracy. Ensure that your test accuracy is greater than 20%.

In [18]:
def test(loaders, model, criterion, use_cuda):

    # monitor test loss and accuracy
    test_loss = 0.
    correct = 0.
    total = 0.

    # set the module to evaluation mode
    model.eval()

    for batch_idx, (data, target) in enumerate(loaders['test']):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # update average test loss 
        test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - test_loss))
        # convert output probabilities to predicted class
        pred = output.data.max(1, keepdim=True)[1]
        # compare predictions to true label
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(0)
            
    print('Test Loss: {:.6f}\n'.format(test_loss))

    print('\nTest Accuracy: %2d%% (%2d/%2d)' % (
        100. * correct / total, correct, total))

# load the model that got the best validation accuracy
model_scratch.load_state_dict(torch.load('model_scratch.pt'))
test(loaders_scratch, model_scratch, criterion_scratch, use_cuda)
Test Loss: 3.020514


Test Accuracy: 25% (1272/4996)

Step 2: Create a CNN to Classify Landmarks (using Transfer Learning)

You will now use transfer learning to create a CNN that can identify landmarks from images. Your CNN must attain at least 60% accuracy on the test set.

(IMPLEMENTATION) Specify Data Loaders for the Landmark Dataset

Use the code cell below to create three separate data loaders: one for training data, one for validation data, and one for test data. Randomly split the images located at landmark_images/train to create the train and validation data loaders, and use the images located at landmark_images/test to create the test data loader.

All three of your data loaders should be accessible via a dictionary named loaders_transfer. Your train data loader should be at loaders_transfer['train'], your validation data loader should be at loaders_transfer['valid'], and your test data loader should be at loaders_transfer['test'].

If you like, you are welcome to use the same data loaders from the previous step, when you created a CNN from scratch.

In [19]:
### TODO: Write data loaders for training, validation, and test sets
## Specify appropriate transforms, and batch_sizes

loaders_transfer = {'train': train_loader, # Using the same data loaders from the previous step
                   'valid': valid_loader, 
                   'test': test_loader}

(IMPLEMENTATION) Specify Loss Function and Optimizer

Use the next code cell to specify a loss function and optimizer. Save the chosen loss function as criterion_transfer, and fill in the function get_optimizer_transfer below.

In [20]:
## TODO: select loss function
criterion_transfer = nn.CrossEntropyLoss()


def get_optimizer_transfer(model, lr=1e-3): # Adding learning rate for fine-tuning later
    ## select and return an optimizer
    optimizer_transfer = optim.SGD(model.parameters(), lr=lr)
    return optimizer_transfer

(IMPLEMENTATION) Model Architecture

Use transfer learning to create a CNN to classify images of landmarks. Use the code cell below, and save your initialized model as the variable model_transfer.

In [21]:
import torchvision.models as models
import torch.nn as nn

## TODO: Specify model architecture

def get_model_transfer(model_name='vgg16'):
    if model_name=='vgg16':
        model_transfer = models.vgg16(pretrained=True)

        # Freeze features parameters
        for param in model_transfer.features.parameters():
            param.require_grad = False

        n_inputs = model_transfer.classifier[6].in_features
        # Keeping unfrozen linear layers with Imagenet's weights as default and updating only last linear layer
        model_transfer.classifier[6] = nn.Linear(n_inputs, 50)
        
    return model_transfer


model_transfer = get_model_transfer('vgg16')


#-#-# Do NOT modify the code below this line. #-#-#

if use_cuda:
    model_transfer = model_transfer.cuda()

Question 3: Outline the steps you took to get to your final CNN architecture and your reasoning at each step. Describe why you think the architecture is suitable for the current problem.

Answer:

I am using the VGG model with pretrained weights from ImageNet. The feature parameters were kept frozen and I am only replacing the last linear layer with a classifier of 50 different outputs (nn.Linear(n_inputs, 50)). The other classifier layers were kept unfrozen but with the Imagenet's weights as starting point. My reasoning here is that there's some semantic in the classifiers that I would like to use as well but fine tune to the Landmarks problem.

(IMPLEMENTATION) Train and Validate the Model

Train and validate your model in the code cell below. Save the final model parameters at filepath 'model_transfer.pt'.

In [22]:
# TODO: train the model and save the best model parameters at filepath 'model_transfer.pt'
model_transfer = train(6, 
                       loaders_transfer, 
                       model_transfer, 
                       get_optimizer_transfer(model_transfer, lr=2e-2), #fine-tuned with lrfinder
                       criterion_transfer, 
                       use_cuda, 
                       'model_transfer.pt')


#-#-# Do NOT modify the code below this line. #-#-#

# load the model that got the best validation accuracy
model_transfer.load_state_dict(torch.load('model_transfer.pt'))
Batch id 0, Training Loss = 4.086527347564697
Batch id 1, Training Loss = 3.9569958448410034
Batch id 2, Training Loss = 3.9604508876800537
Batch id 3, Training Loss = 3.9007903933525085
Batch id 4, Training Loss = 3.8828179359436037
Batch id 5, Training Loss = 3.896682103474935
Batch id 6, Training Loss = 3.830778053828648
Batch id 7, Training Loss = 3.767678916454315
Batch id 8, Training Loss = 3.6960773203108044
Batch id 9, Training Loss = 3.650871419906616
Batch id 10, Training Loss = 3.5951815735210073
Batch id 11, Training Loss = 3.5454651713371277
Batch id 12, Training Loss = 3.48576675928556
Batch id 13, Training Loss = 3.475144164902823
Batch id 14, Training Loss = 3.441463581720988
Batch id 15, Training Loss = 3.420202597975731
Batch id 16, Training Loss = 3.3641575785244213
Batch id 17, Training Loss = 3.339177555508084
Batch id 18, Training Loss = 3.318787072834216
Batch id 19, Training Loss = 3.2975497603416444
Batch id 20, Training Loss = 3.276572386423747
Batch id 21, Training Loss = 3.227351708845659
Batch id 22, Training Loss = 3.176838232123334
Batch id 23, Training Loss = 3.1450683275858564
Batch id 24, Training Loss = 3.118536243438721
Batch id 25, Training Loss = 3.1105015644660363
Batch id 26, Training Loss = 3.0826955724645546
Batch id 27, Training Loss = 3.0525408131735667
Batch id 28, Training Loss = 3.028593236002429
Batch id 29, Training Loss = 3.0277409315109254
Batch id 30, Training Loss = 3.0222654727197464
Batch id 31, Training Loss = 3.003353141248226
Batch id 32, Training Loss = 2.9852104259259775
Batch id 33, Training Loss = 2.9661334963405834
Batch id 34, Training Loss = 2.9273424557277137
Batch id 35, Training Loss = 2.8917666375637054
Batch id 36, Training Loss = 2.8662057889474406
Batch id 37, Training Loss = 2.859207454480623
Batch id 38, Training Loss = 2.8473587280664687
Batch id 39, Training Loss = 2.8346980452537536
Batch id 40, Training Loss = 2.817538906888264
Batch id 41, Training Loss = 2.798360160418919
Batch id 42, Training Loss = 2.776969632437063
Batch id 43, Training Loss = 2.7533706345341424
Batch id 44, Training Loss = 2.74039072725508
Batch id 45, Training Loss = 2.722895549691242
Batch id 46, Training Loss = 2.7137397502331027
Batch id 47, Training Loss = 2.7068311522404356
Batch id 48, Training Loss = 2.6926447566674683
Batch id 49, Training Loss = 2.6784891915321354
Batch id 50, Training Loss = 2.6582449464236992
Batch id 51, Training Loss = 2.637190917363534
Batch id 52, Training Loss = 2.6156044321240124
Batch id 53, Training Loss = 2.610810928874546
Batch id 54, Training Loss = 2.616059472344139
Batch id 55, Training Loss = 2.6063250771590645
Batch id 56, Training Loss = 2.5947752856371697
Batch id 57, Training Loss = 2.5843475851519355
Batch id 58, Training Loss = 2.574254581483744
Batch id 59, Training Loss = 2.5587293326854708
Batch id 60, Training Loss = 2.542921195264723
Batch id 61, Training Loss = 2.532967715494095
Batch id 62, Training Loss = 2.532679389393519
Batch id 63, Training Loss = 2.529803751036525
Batch id 64, Training Loss = 2.523621007112357
Batch id 65, Training Loss = 2.5081338123841723
Batch id 66, Training Loss = 2.5022102185149695
Batch id 67, Training Loss = 2.4950432251481454
Batch id 68, Training Loss = 2.483076261437458
Batch id 69, Training Loss = 2.470308434963227
Batch id 70, Training Loss = 2.457995344215716
Batch id 71, Training Loss = 2.444745081994269
Batch id 72, Training Loss = 2.436205558580895
Batch id 73, Training Loss = 2.433825745775893
Batch id 74, Training Loss = 2.4273767010370895
Batch id 75, Training Loss = 2.4177953632254354
Batch id 76, Training Loss = 2.4083172764096945
Batch id 77, Training Loss = 2.391729975358034
Batch id 78, Training Loss = 2.378212771838225
Batch id 79, Training Loss = 2.368348559737206
Batch id 80, Training Loss = 2.3661130387106066
Batch id 81, Training Loss = 2.3671623293946435
Batch id 82, Training Loss = 2.3594295605119457
Batch id 83, Training Loss = 2.360206646578653
Batch id 84, Training Loss = 2.3550591623081885
Batch id 85, Training Loss = 2.3515766900639203
Batch id 86, Training Loss = 2.3444273430725624
Batch id 87, Training Loss = 2.3405049443244934
Batch id 88, Training Loss = 2.334973175873917
Batch id 89, Training Loss = 2.325519323348999
Batch id 90, Training Loss = 2.3185319166917067
Batch id 91, Training Loss = 2.3185292275055596
Batch id 92, Training Loss = 2.3130562613087315
Batch id 93, Training Loss = 2.3014440650635577
Batch id 94, Training Loss = 2.291347797293412
Batch id 95, Training Loss = 2.2869004731376967
Batch id 96, Training Loss = 2.2805833423260564
Batch id 97, Training Loss = 2.2714671723696656
Batch id 98, Training Loss = 2.2670225338502368
Batch id 99, Training Loss = 2.258433886766434
Batch id 100, Training Loss = 2.253074795892924
Batch id 101, Training Loss = 2.246011772576501
Batch id 102, Training Loss = 2.238038101242585
Batch id 103, Training Loss = 2.228795295724503
Batch id 104, Training Loss = 2.2245987744558433
Batch id 105, Training Loss = 2.219368592748103
Batch id 106, Training Loss = 2.2171686954587426
Batch id 107, Training Loss = 2.207123455074099
Batch id 108, Training Loss = 2.200047854983479
Batch id 109, Training Loss = 2.1905108917843217
Batch id 110, Training Loss = 2.1844689352018345
Batch id 111, Training Loss = 2.177975391702993
Batch id 112, Training Loss = 2.1744298861090066
Batch id 113, Training Loss = 2.1685300291630263
Batch id 114, Training Loss = 2.166006374359131
Batch id 115, Training Loss = 2.1623234080857245
Batch id 116, Training Loss = 2.159304598457793
Batch id 117, Training Loss = 2.152873715101663
Batch id 118, Training Loss = 2.150119546080838
Batch id 119, Training Loss = 2.147120867172877
Batch id 120, Training Loss = 2.1404640260806755
Batch id 121, Training Loss = 2.14018681205687
Batch id 122, Training Loss = 2.130838271079025
Batch id 123, Training Loss = 2.1293122412696963
Batch id 124, Training Loss = 2.122962158203125
Epoch: 1 	Training Loss: 2.122962 	Validation Loss: 1.278186
Validation loss decreased from inf to 1.278186. Saving model.
Batch id 0, Training Loss = 0.9013155102729797
Batch id 1, Training Loss = 0.9933930337429047
Batch id 2, Training Loss = 1.0328390002250671
Batch id 3, Training Loss = 0.9812152832746506
Batch id 4, Training Loss = 1.049412763118744
Batch id 5, Training Loss = 1.0060822467009227
Batch id 6, Training Loss = 1.0625317863055639
Batch id 7, Training Loss = 1.0206879451870918
Batch id 8, Training Loss = 1.0331392486890156
Batch id 9, Training Loss = 1.0499448478221893
Batch id 10, Training Loss = 1.041020685976202
Batch id 11, Training Loss = 1.049708257118861
Batch id 12, Training Loss = 1.0850785328791692
Batch id 13, Training Loss = 1.1002288971628462
Batch id 14, Training Loss = 1.108708699544271
Batch id 15, Training Loss = 1.1040525212883952
Batch id 16, Training Loss = 1.1059278950971718
Batch id 17, Training Loss = 1.1032521393564014
Batch id 18, Training Loss = 1.0821926311442729
Batch id 19, Training Loss = 1.0827776461839678
Batch id 20, Training Loss = 1.0859693941615878
Batch id 21, Training Loss = 1.1010322272777557
Batch id 22, Training Loss = 1.1059033689291582
Batch id 23, Training Loss = 1.1141796633601189
Batch id 24, Training Loss = 1.1211215615272523
Batch id 25, Training Loss = 1.1107860161707952
Batch id 26, Training Loss = 1.0958072587295815
Batch id 27, Training Loss = 1.0950398040669305
Batch id 28, Training Loss = 1.0904524223557834
Batch id 29, Training Loss = 1.1082513511180878
Batch id 30, Training Loss = 1.123576089259117
Batch id 31, Training Loss = 1.14422888122499
Batch id 32, Training Loss = 1.1364237160393686
Batch id 33, Training Loss = 1.1240379354533028
Batch id 34, Training Loss = 1.1184681006840298
Batch id 35, Training Loss = 1.115712583065033
Batch id 36, Training Loss = 1.1110800279153359
Batch id 37, Training Loss = 1.110572178112833
Batch id 38, Training Loss = 1.110390054873931
Batch id 39, Training Loss = 1.1176566779613493
Batch id 40, Training Loss = 1.1088433236610598
Batch id 41, Training Loss = 1.1089602226302737
Batch id 42, Training Loss = 1.109322517417198
Batch id 43, Training Loss = 1.1100349697199734
Batch id 44, Training Loss = 1.1138397375742595
Batch id 45, Training Loss = 1.1076553438020789
Batch id 46, Training Loss = 1.104769136043305
Batch id 47, Training Loss = 1.1107167030374208
Batch id 48, Training Loss = 1.112965415935127
Batch id 49, Training Loss = 1.1048427832126617
Batch id 50, Training Loss = 1.098608650413214
Batch id 51, Training Loss = 1.1104219899727747
Batch id 52, Training Loss = 1.1086458957420204
Batch id 53, Training Loss = 1.1071747872564528
Batch id 54, Training Loss = 1.1051237897439437
Batch id 55, Training Loss = 1.1130017031516348
Batch id 56, Training Loss = 1.116341995565515
Batch id 57, Training Loss = 1.117580589549295
Batch id 58, Training Loss = 1.1090579346074898
Batch id 59, Training Loss = 1.1029309978087745
Batch id 60, Training Loss = 1.092254116886952
Batch id 61, Training Loss = 1.0897993435782771
Batch id 62, Training Loss = 1.0954314536518521
Batch id 63, Training Loss = 1.0953500056639314
Batch id 64, Training Loss = 1.0926288384657639
Batch id 65, Training Loss = 1.0943801511417734
Batch id 66, Training Loss = 1.0906671072120095
Batch id 67, Training Loss = 1.0896745622158048
Batch id 68, Training Loss = 1.0961925033209978
Batch id 69, Training Loss = 1.0948895079748968
Batch id 70, Training Loss = 1.0933333433849708
Batch id 71, Training Loss = 1.0961577494939165
Batch id 72, Training Loss = 1.0941350664177982
Batch id 73, Training Loss = 1.1006376413074697
Batch id 74, Training Loss = 1.1068024627367652
Batch id 75, Training Loss = 1.1121053217273007
Batch id 76, Training Loss = 1.1082630799962325
Batch id 77, Training Loss = 1.1084124805071411
Batch id 78, Training Loss = 1.1087732201890097
Batch id 79, Training Loss = 1.1143713094294068
Batch id 80, Training Loss = 1.113898244169023
Batch id 81, Training Loss = 1.1145815725733592
Batch id 82, Training Loss = 1.1114117804780062
Batch id 83, Training Loss = 1.1101584356455574
Batch id 84, Training Loss = 1.1081918779541462
Batch id 85, Training Loss = 1.1050587941047754
Batch id 86, Training Loss = 1.1074604529073866
Batch id 87, Training Loss = 1.1080333868210963
Batch id 88, Training Loss = 1.1064345240592954
Batch id 89, Training Loss = 1.1001304527123767
Batch id 90, Training Loss = 1.1030868508003568
Batch id 91, Training Loss = 1.1032655141923737
Batch id 92, Training Loss = 1.1024601260821023
Batch id 93, Training Loss = 1.1042572963745034
Batch id 94, Training Loss = 1.0998331929508007
Batch id 95, Training Loss = 1.0959332914402087
Batch id 96, Training Loss = 1.0983357767468875
Batch id 97, Training Loss = 1.1027731804215177
Batch id 98, Training Loss = 1.1038297107725432
Batch id 99, Training Loss = 1.102640615105629
Batch id 100, Training Loss = 1.1017071451290998
Batch id 101, Training Loss = 1.0973802226431228
Batch id 102, Training Loss = 1.0975079600093434
Batch id 103, Training Loss = 1.099431869502251
Batch id 104, Training Loss = 1.1001043847629002
Batch id 105, Training Loss = 1.0987241206304081
Batch id 106, Training Loss = 1.0979388935543666
Batch id 107, Training Loss = 1.0969480077425637
Batch id 108, Training Loss = 1.0975810048777028
Batch id 109, Training Loss = 1.1010936401107094
Batch id 110, Training Loss = 1.0991395953539256
Batch id 111, Training Loss = 1.0976737210793155
Batch id 112, Training Loss = 1.0956736106788163
Batch id 113, Training Loss = 1.0952673575334382
Batch id 114, Training Loss = 1.0927232835603797
Batch id 115, Training Loss = 1.0960119623562385
Batch id 116, Training Loss = 1.0955106244127975
Batch id 117, Training Loss = 1.095863243280831
Batch id 118, Training Loss = 1.0942789035684923
Batch id 119, Training Loss = 1.0944524635871253
Batch id 120, Training Loss = 1.0936429431615784
Batch id 121, Training Loss = 1.0937470959835367
Batch id 122, Training Loss = 1.0919325865381133
Batch id 123, Training Loss = 1.0928498727660025
Batch id 124, Training Loss = 1.0920030879974365
Epoch: 2 	Training Loss: 1.092003 	Validation Loss: 1.208515
Validation loss decreased from 1.278186 to 1.208515. Saving model.
Batch id 0, Training Loss = 0.5755616426467896
Batch id 1, Training Loss = 0.4975958913564682
Batch id 2, Training Loss = 0.6052710910638174
Batch id 3, Training Loss = 0.6499690189957619
Batch id 4, Training Loss = 0.6519889771938324
Batch id 5, Training Loss = 0.6097275813420614
Batch id 6, Training Loss = 0.5524367775235858
Batch id 7, Training Loss = 0.5592393800616264
Batch id 8, Training Loss = 0.5725590454207526
Batch id 9, Training Loss = 0.6058678567409516
Batch id 10, Training Loss = 0.6160713813521645
Batch id 11, Training Loss = 0.5788489853342375
Batch id 12, Training Loss = 0.558160247711035
Batch id 13, Training Loss = 0.5422779108796801
Batch id 14, Training Loss = 0.552536396185557
Batch id 15, Training Loss = 0.5519000627100468
Batch id 16, Training Loss = 0.59523225181243
Batch id 17, Training Loss = 0.605333169301351
Batch id 18, Training Loss = 0.6250168053727402
Batch id 19, Training Loss = 0.6463836759328843
Batch id 20, Training Loss = 0.643395381314414
Batch id 21, Training Loss = 0.638972512700341
Batch id 22, Training Loss = 0.6263014052225196
Batch id 23, Training Loss = 0.6281563937664033
Batch id 24, Training Loss = 0.6362430739402772
Batch id 25, Training Loss = 0.6317011828605946
Batch id 26, Training Loss = 0.6469183299276565
Batch id 27, Training Loss = 0.6439797856978009
Batch id 28, Training Loss = 0.6503198989506427
Batch id 29, Training Loss = 0.6508565644423168
Batch id 30, Training Loss = 0.657455200149167
Batch id 31, Training Loss = 0.6581162363290788
Batch id 32, Training Loss = 0.656250171589129
Batch id 33, Training Loss = 0.6589062915128822
Batch id 34, Training Loss = 0.6614313602447511
Batch id 35, Training Loss = 0.6667066348923578
Batch id 36, Training Loss = 0.6706665174381155
Batch id 37, Training Loss = 0.6807377558005485
Batch id 38, Training Loss = 0.6849428568130886
Batch id 39, Training Loss = 0.6893158078193666
Batch id 40, Training Loss = 0.6843882406630168
Batch id 41, Training Loss = 0.6870993489310856
Batch id 42, Training Loss = 0.6917300917381465
Batch id 43, Training Loss = 0.6926960836757314
Batch id 44, Training Loss = 0.6884967883427939
Batch id 45, Training Loss = 0.6894350907076962
Batch id 46, Training Loss = 0.684128671250445
Batch id 47, Training Loss = 0.6825085567931336
Batch id 48, Training Loss = 0.6808105512541169
Batch id 49, Training Loss = 0.6810138511657716
Batch id 50, Training Loss = 0.6766897065966738
Batch id 51, Training Loss = 0.6826206514468561
Batch id 52, Training Loss = 0.6771237383473596
Batch id 53, Training Loss = 0.6760664764377807
Batch id 54, Training Loss = 0.6785668183456769
Batch id 55, Training Loss = 0.6815416967230185
Batch id 56, Training Loss = 0.6832455067258133
Batch id 57, Training Loss = 0.679631396100439
Batch id 58, Training Loss = 0.6741577684879304
Batch id 59, Training Loss = 0.6758508279919625
Batch id 60, Training Loss = 0.6752129387660105
Batch id 61, Training Loss = 0.6747806817293167
Batch id 62, Training Loss = 0.6758134237357548
Batch id 63, Training Loss = 0.672111667227
Batch id 64, Training Loss = 0.6676774955712832
Batch id 65, Training Loss = 0.6650170873511921
Batch id 66, Training Loss = 0.6661351821315823
Batch id 67, Training Loss = 0.667695924639702
Batch id 68, Training Loss = 0.6641230514084084
Batch id 69, Training Loss = 0.6687812822205681
Batch id 70, Training Loss = 0.665029653361146
Batch id 71, Training Loss = 0.6650571425755819
Batch id 72, Training Loss = 0.6614399944266228
Batch id 73, Training Loss = 0.6584460094973849
Batch id 74, Training Loss = 0.6576327248414359
Batch id 75, Training Loss = 0.6535924292708699
Batch id 76, Training Loss = 0.6606515843372841
Batch id 77, Training Loss = 0.6617320053852522
Batch id 78, Training Loss = 0.6596850628339792
Batch id 79, Training Loss = 0.6616384241729976
Batch id 80, Training Loss = 0.65625505057382
Batch id 81, Training Loss = 0.6565296304662054
Batch id 82, Training Loss = 0.6555274455662234
Batch id 83, Training Loss = 0.659208428646837
Batch id 84, Training Loss = 0.6577493685133318
Batch id 85, Training Loss = 0.6593213590771654
Batch id 86, Training Loss = 0.6588180801649204
Batch id 87, Training Loss = 0.6650011265142398
Batch id 88, Training Loss = 0.6650839672329721
Batch id 89, Training Loss = 0.66528653535578
Batch id 90, Training Loss = 0.6666399335468209
Batch id 91, Training Loss = 0.6667364346592324
Batch id 92, Training Loss = 0.6712210566125891
Batch id 93, Training Loss = 0.6732020355919576
Batch id 94, Training Loss = 0.6721974137582278
Batch id 95, Training Loss = 0.6703138121714196
Batch id 96, Training Loss = 0.6691039173873431
Batch id 97, Training Loss = 0.6655813601552225
Batch id 98, Training Loss = 0.6639730177744472
Batch id 99, Training Loss = 0.6640094828605653
Batch id 100, Training Loss = 0.6650493097777415
Batch id 101, Training Loss = 0.6648782970858556
Batch id 102, Training Loss = 0.6642801397055098
Batch id 103, Training Loss = 0.6665038105386955
Batch id 104, Training Loss = 0.6637596749124073
Batch id 105, Training Loss = 0.6618766742494872
Batch id 106, Training Loss = 0.6640338248738619
Batch id 107, Training Loss = 0.6649068558105716
Batch id 108, Training Loss = 0.6670599287256188
Batch id 109, Training Loss = 0.6677827268838882
Batch id 110, Training Loss = 0.6674887288261103
Batch id 111, Training Loss = 0.6699089793754475
Batch id 112, Training Loss = 0.6694391826085284
Batch id 113, Training Loss = 0.6691640393252958
Batch id 114, Training Loss = 0.6689877466015193
Batch id 115, Training Loss = 0.6710375282784987
Batch id 116, Training Loss = 0.6713260801938863
Batch id 117, Training Loss = 0.6699612138129896
Batch id 118, Training Loss = 0.6695859459768823
Batch id 119, Training Loss = 0.6718616145352521
Batch id 120, Training Loss = 0.6725376846869128
Batch id 121, Training Loss = 0.6722906859683206
Batch id 122, Training Loss = 0.6736303226249971
Batch id 123, Training Loss = 0.6764102075849807
Batch id 124, Training Loss = 0.6770583574771878
Epoch: 3 	Training Loss: 0.677058 	Validation Loss: 1.202628
Validation loss decreased from 1.208515 to 1.202628. Saving model.
Batch id 0, Training Loss = 0.3211105465888977
Batch id 1, Training Loss = 0.4086633324623108
Batch id 2, Training Loss = 0.3471282521883647
Batch id 3, Training Loss = 0.3286784812808037
Batch id 4, Training Loss = 0.3171693742275238
Batch id 5, Training Loss = 0.3499070753653844
Batch id 6, Training Loss = 0.34669816919735497
Batch id 7, Training Loss = 0.40074335411190987
Batch id 8, Training Loss = 0.4163190358214908
Batch id 9, Training Loss = 0.401601055264473
Batch id 10, Training Loss = 0.37732772122729913
Batch id 11, Training Loss = 0.3835146377484004
Batch id 12, Training Loss = 0.39338542406375593
Batch id 13, Training Loss = 0.3837027741330011
Batch id 14, Training Loss = 0.36962080796559654
Batch id 15, Training Loss = 0.36158659867942333
Batch id 16, Training Loss = 0.38252782996963053
Batch id 17, Training Loss = 0.3762388047244814
Batch id 18, Training Loss = 0.3834635034987801
Batch id 19, Training Loss = 0.3860975116491318
Batch id 20, Training Loss = 0.3880608990078881
Batch id 21, Training Loss = 0.383247435092926
Batch id 22, Training Loss = 0.37799749426219775
Batch id 23, Training Loss = 0.3774092035988967
Batch id 24, Training Loss = 0.36807697236537934
Batch id 25, Training Loss = 0.3721553092965713
Batch id 26, Training Loss = 0.37330808738867444
Batch id 27, Training Loss = 0.3700104850743498
Batch id 28, Training Loss = 0.3665712275381746
Batch id 29, Training Loss = 0.3643410975734393
Batch id 30, Training Loss = 0.35750132126192896
Batch id 31, Training Loss = 0.36212295386940246
Batch id 32, Training Loss = 0.3647297313719085
Batch id 33, Training Loss = 0.3661639103118111
Batch id 34, Training Loss = 0.38007137860570633
Batch id 35, Training Loss = 0.3926561483078533
Batch id 36, Training Loss = 0.4001679670166325
Batch id 37, Training Loss = 0.41250030068974747
Batch id 38, Training Loss = 0.40996875289158946
Batch id 39, Training Loss = 0.4134498693048954
Batch id 40, Training Loss = 0.4134894791172772
Batch id 41, Training Loss = 0.4118983255965369
Batch id 42, Training Loss = 0.4151976184789524
Batch id 43, Training Loss = 0.41640996052460233
Batch id 44, Training Loss = 0.4171608103646172
Batch id 45, Training Loss = 0.41589296511981794
Batch id 46, Training Loss = 0.4101014451143589
Batch id 47, Training Loss = 0.40603773668408394
Batch id 48, Training Loss = 0.40216426520931475
Batch id 49, Training Loss = 0.40066452383995055
Batch id 50, Training Loss = 0.3979683471660988
Batch id 51, Training Loss = 0.41319823838197267
Batch id 52, Training Loss = 0.4259256090757982
Batch id 53, Training Loss = 0.44200378325250417
Batch id 54, Training Loss = 0.4465769626877525
Batch id 55, Training Loss = 0.44832285919359754
Batch id 56, Training Loss = 0.4499153578490542
Batch id 57, Training Loss = 0.44966224041478386
Batch id 58, Training Loss = 0.45106999045711454
Batch id 59, Training Loss = 0.44657696634531024
Batch id 60, Training Loss = 0.44273408832120115
Batch id 61, Training Loss = 0.44615174998198787
Batch id 62, Training Loss = 0.444340096816184
Batch id 63, Training Loss = 0.4474549221340567
Batch id 64, Training Loss = 0.4435341371939732
Batch id 65, Training Loss = 0.44816754397117725
Batch id 66, Training Loss = 0.45064720066625674
Batch id 67, Training Loss = 0.44868901678744477
Batch id 68, Training Loss = 0.45031654014103645
Batch id 69, Training Loss = 0.44745997062751214
Batch id 70, Training Loss = 0.4488439916724889
Batch id 71, Training Loss = 0.44850000118215866
Batch id 72, Training Loss = 0.44754836330675085
Batch id 73, Training Loss = 0.45134996723484333
Batch id 74, Training Loss = 0.4512355260054269
Batch id 75, Training Loss = 0.4505486480499567
Batch id 76, Training Loss = 0.44733742595493003
Batch id 77, Training Loss = 0.44586668239954175
Batch id 78, Training Loss = 0.44436169519454605
Batch id 79, Training Loss = 0.4412246784195303
Batch id 80, Training Loss = 0.4388261014296684
Batch id 81, Training Loss = 0.4372918929268673
Batch id 82, Training Loss = 0.43716427432485366
Batch id 83, Training Loss = 0.4359750439013753
Batch id 84, Training Loss = 0.4367815827622133
Batch id 85, Training Loss = 0.4370243597862332
Batch id 86, Training Loss = 0.4356381530049203
Batch id 87, Training Loss = 0.4378942528908903
Batch id 88, Training Loss = 0.43699939666169413
Batch id 89, Training Loss = 0.43648396366172365
Batch id 90, Training Loss = 0.43692606556546554
Batch id 91, Training Loss = 0.43800818401834235
Batch id 92, Training Loss = 0.4357688628858135
Batch id 93, Training Loss = 0.43321494409378536
Batch id 94, Training Loss = 0.43279945756259713
Batch id 95, Training Loss = 0.43589882086962456
Batch id 96, Training Loss = 0.4345156000447027
Batch id 97, Training Loss = 0.43528581851599163
Batch id 98, Training Loss = 0.4432958132690853
Batch id 99, Training Loss = 0.4490099534392356
Batch id 100, Training Loss = 0.4522726580057993
Batch id 101, Training Loss = 0.45434509100867243
Batch id 102, Training Loss = 0.4521581890802938
Batch id 103, Training Loss = 0.4538971416365641
Batch id 104, Training Loss = 0.4515315640540349
Batch id 105, Training Loss = 0.4550878175024715
Batch id 106, Training Loss = 0.4550814311081003
Batch id 107, Training Loss = 0.45189947234811595
Batch id 108, Training Loss = 0.4507331797562607
Batch id 109, Training Loss = 0.45301928235725913
Batch id 110, Training Loss = 0.45829456279406666
Batch id 111, Training Loss = 0.4559068738349845
Batch id 112, Training Loss = 0.45450886477411306
Batch id 113, Training Loss = 0.45625228630868997
Batch id 114, Training Loss = 0.45548067403876247
Batch id 115, Training Loss = 0.4537158608436583
Batch id 116, Training Loss = 0.45527816990501846
Batch id 117, Training Loss = 0.45440555376521596
Batch id 118, Training Loss = 0.4536642990693323
Batch id 119, Training Loss = 0.45473522618412954
Batch id 120, Training Loss = 0.45542793751748123
Batch id 121, Training Loss = 0.4553833020026565
Batch id 122, Training Loss = 0.4549991282505716
Batch id 123, Training Loss = 0.4565441283968185
Batch id 124, Training Loss = 0.46519954800605756
Epoch: 4 	Training Loss: 0.465200 	Validation Loss: 1.903439
Batch id 0, Training Loss = 1.3547337055206299
Batch id 1, Training Loss = 0.9659199416637421
Batch id 2, Training Loss = 0.8211077253023784
Batch id 3, Training Loss = 0.6644974797964096
Batch id 4, Training Loss = 0.5561023861169815
Batch id 5, Training Loss = 0.5073454057176907
Batch id 6, Training Loss = 0.4913288205862045
Batch id 7, Training Loss = 0.44248554483056063
Batch id 8, Training Loss = 0.41518050597773654
Batch id 9, Training Loss = 0.4199530616402626
Batch id 10, Training Loss = 0.398128172213381
Batch id 11, Training Loss = 0.3776811671753724
Batch id 12, Training Loss = 0.3606612831354141
Batch id 13, Training Loss = 0.34557489837918964
Batch id 14, Training Loss = 0.3496861755847931
Batch id 15, Training Loss = 0.3562126196920872
Batch id 16, Training Loss = 0.385179365382475
Batch id 17, Training Loss = 0.39746595753563774
Batch id 18, Training Loss = 0.4022590000378458
Batch id 19, Training Loss = 0.39459324553608893
Batch id 20, Training Loss = 0.38573046028614044
Batch id 21, Training Loss = 0.3739789995280179
Batch id 22, Training Loss = 0.3678629009620003
Batch id 23, Training Loss = 0.3689609058201313
Batch id 24, Training Loss = 0.36739441752433777
Batch id 25, Training Loss = 0.366288892351664
Batch id 26, Training Loss = 0.36478872762786013
Batch id 27, Training Loss = 0.36036648122327664
Batch id 28, Training Loss = 0.3527381913415316
Batch id 29, Training Loss = 0.34620481828848515
Batch id 30, Training Loss = 0.33984229785780745
Batch id 31, Training Loss = 0.33223104104399676
Batch id 32, Training Loss = 0.32507784664630884
Batch id 33, Training Loss = 0.31696691101088237
Batch id 34, Training Loss = 0.3110106519290379
Batch id 35, Training Loss = 0.30957505603631336
Batch id 36, Training Loss = 0.3034308407757733
Batch id 37, Training Loss = 0.30253150431733383
Batch id 38, Training Loss = 0.3045429793687967
Batch id 39, Training Loss = 0.33406449034810065
Batch id 40, Training Loss = 0.35046606601738345
Batch id 41, Training Loss = 0.35856024920940394
Batch id 42, Training Loss = 0.3669643187245657
Batch id 43, Training Loss = 0.3725624551827257
Batch id 44, Training Loss = 0.3701155814859602
Batch id 45, Training Loss = 0.3699848055839538
Batch id 46, Training Loss = 0.3695797019816459
Batch id 47, Training Loss = 0.3662107589965065
Batch id 48, Training Loss = 0.3683721055181659
Batch id 49, Training Loss = 0.3674546858668327
Batch id 50, Training Loss = 0.3669635188930175
Batch id 51, Training Loss = 0.362098071580896
Batch id 52, Training Loss = 0.3577939855321398
Batch id 53, Training Loss = 0.35488508076027586
Batch id 54, Training Loss = 0.353611473468217
Batch id 55, Training Loss = 0.35079017507710625
Batch id 56, Training Loss = 0.3528882695133226
Batch id 57, Training Loss = 0.35278001594646224
Batch id 58, Training Loss = 0.35548292239338664
Batch id 59, Training Loss = 0.35224304857353367
Batch id 60, Training Loss = 0.3506126820308263
Batch id 61, Training Loss = 0.3479598406101427
Batch id 62, Training Loss = 0.3467980170297244
Batch id 63, Training Loss = 0.3459447134518996
Batch id 64, Training Loss = 0.34578939687747223
Batch id 65, Training Loss = 0.34422866040558525
Batch id 66, Training Loss = 0.34395797739722833
Batch id 67, Training Loss = 0.34645907482241883
Batch id 68, Training Loss = 0.3490754200712494
Batch id 69, Training Loss = 0.3509767109794276
Batch id 70, Training Loss = 0.35052833622190316
Batch id 71, Training Loss = 0.3494042467532886
Batch id 72, Training Loss = 0.34661037352395385
Batch id 73, Training Loss = 0.34739280844459663
Batch id 74, Training Loss = 0.3462760866681735
Batch id 75, Training Loss = 0.34906933074326896
Batch id 76, Training Loss = 0.35524059474081193
Batch id 77, Training Loss = 0.35468008397863465
Batch id 78, Training Loss = 0.35521210720644725
Batch id 79, Training Loss = 0.35734226899221544
Batch id 80, Training Loss = 0.35739440250175974
Batch id 81, Training Loss = 0.355936403772453
Batch id 82, Training Loss = 0.35711697758321304
Batch id 83, Training Loss = 0.355244268884971
Batch id 84, Training Loss = 0.3531830823596786
Batch id 85, Training Loss = 0.35396649506549505
Batch id 86, Training Loss = 0.35242643584122607
Batch id 87, Training Loss = 0.3492512659762394
Batch id 88, Training Loss = 0.3510024444608207
Batch id 89, Training Loss = 0.35094047362605735
Batch id 90, Training Loss = 0.34983114267771065
Batch id 91, Training Loss = 0.34944646586866485
Batch id 92, Training Loss = 0.3483994065593648
Batch id 93, Training Loss = 0.3462303124685237
Batch id 94, Training Loss = 0.3480228181732329
Batch id 95, Training Loss = 0.3536210739209007
Batch id 96, Training Loss = 0.35453478876770167
Batch id 97, Training Loss = 0.35750510551187464
Batch id 98, Training Loss = 0.35779794441028084
Batch id 99, Training Loss = 0.3567406579107047
Batch id 100, Training Loss = 0.3554324405175625
Batch id 101, Training Loss = 0.3550228620422822
Batch id 102, Training Loss = 0.35473595465560565
Batch id 103, Training Loss = 0.3556196240421671
Batch id 104, Training Loss = 0.3549574083515576
Batch id 105, Training Loss = 0.3543255296096487
Batch id 106, Training Loss = 0.3526388903226808
Batch id 107, Training Loss = 0.35257510047543933
Batch id 108, Training Loss = 0.35302682022709364
Batch id 109, Training Loss = 0.3533200253817168
Batch id 110, Training Loss = 0.35378600811367633
Batch id 111, Training Loss = 0.35289282198729255
Batch id 112, Training Loss = 0.3511948735181209
Batch id 113, Training Loss = 0.35106476925705604
Batch id 114, Training Loss = 0.3488987370029739
Batch id 115, Training Loss = 0.34856261229463686
Batch id 116, Training Loss = 0.34814922441529406
Batch id 117, Training Loss = 0.34728005673673185
Batch id 118, Training Loss = 0.3475753959863125
Batch id 119, Training Loss = 0.3473350328082839
Batch id 120, Training Loss = 0.3465006348392194
Batch id 121, Training Loss = 0.3483896526034737
Batch id 122, Training Loss = 0.3480192839009005
Batch id 123, Training Loss = 0.34816781882076486
Batch id 124, Training Loss = 0.347687979876995
Epoch: 5 	Training Loss: 0.347688 	Validation Loss: 1.176933
Validation loss decreased from 1.202628 to 1.176933. Saving model.
Batch id 0, Training Loss = 0.31174927949905396
Batch id 1, Training Loss = 0.25243284553289413
Batch id 2, Training Loss = 0.2429057111342748
Batch id 3, Training Loss = 0.24906938895583153
Batch id 4, Training Loss = 0.303421351313591
Batch id 5, Training Loss = 0.2954853648940722
Batch id 6, Training Loss = 0.2646443971565791
Batch id 7, Training Loss = 0.24546946119517085
Batch id 8, Training Loss = 0.2262184065249231
Batch id 9, Training Loss = 0.2144670911133289
Batch id 10, Training Loss = 0.19758624448017637
Batch id 11, Training Loss = 0.19369106553494927
Batch id 12, Training Loss = 0.19792830657500485
Batch id 13, Training Loss = 0.1906604633799621
Batch id 14, Training Loss = 0.18615168531735737
Batch id 15, Training Loss = 0.19806269742548466
Batch id 16, Training Loss = 0.19160698792513678
Batch id 17, Training Loss = 0.18801296088430616
Batch id 18, Training Loss = 0.18669063793985466
Batch id 19, Training Loss = 0.18271433264017103
Batch id 20, Training Loss = 0.17620777232306342
Batch id 21, Training Loss = 0.1696201129393144
Batch id 22, Training Loss = 0.16971289204514545
Batch id 23, Training Loss = 0.16985772425929704
Batch id 24, Training Loss = 0.17308836579322814
Batch id 25, Training Loss = 0.17112756921694827
Batch id 26, Training Loss = 0.17812337809138826
Batch id 27, Training Loss = 0.18064190500548905
Batch id 28, Training Loss = 0.176116662806478
Batch id 29, Training Loss = 0.17413225471973415
Batch id 30, Training Loss = 0.1746697930559035
Batch id 31, Training Loss = 0.17515830788761375
Batch id 32, Training Loss = 0.17411465717084476
Batch id 33, Training Loss = 0.17189019611653156
Batch id 34, Training Loss = 0.1724642877067838
Batch id 35, Training Loss = 0.17920328386955786
Batch id 36, Training Loss = 0.23837631336740542
Batch id 37, Training Loss = 0.26995346303048884
Batch id 38, Training Loss = 0.28434107586359364
Batch id 39, Training Loss = 0.29243370331823826
Batch id 40, Training Loss = 0.2936256153554451
Batch id 41, Training Loss = 0.2919990729008402
Batch id 42, Training Loss = 0.2884270791397538
Batch id 43, Training Loss = 0.29227700084447855
Batch id 44, Training Loss = 0.29386543366644113
Batch id 45, Training Loss = 0.294866153727407
Batch id 46, Training Loss = 0.2947680278027311
Batch id 47, Training Loss = 0.29024332575500006
Batch id 48, Training Loss = 0.28920799341737
Batch id 49, Training Loss = 0.2896313723921775
Batch id 50, Training Loss = 0.2930228724783541
Batch id 51, Training Loss = 0.3011548808560921
Batch id 52, Training Loss = 0.3040471231600023
Batch id 53, Training Loss = 0.30319817722947506
Batch id 54, Training Loss = 0.30538647256114265
Batch id 55, Training Loss = 0.3066394980996847
Batch id 56, Training Loss = 0.30595615584599345
Batch id 57, Training Loss = 0.3044006536746847
Batch id 58, Training Loss = 0.30370058346602874
Batch id 59, Training Loss = 0.3039202243089676
Batch id 60, Training Loss = 0.3073665177235838
Batch id 61, Training Loss = 0.306413626959247
Batch id 62, Training Loss = 0.3021769925715432
Batch id 63, Training Loss = 0.3009862285107375
Batch id 64, Training Loss = 0.2995436377250232
Batch id 65, Training Loss = 0.29877938849456387
Batch id 66, Training Loss = 0.29779933949015036
Batch id 67, Training Loss = 0.2970849374199615
Batch id 68, Training Loss = 0.2950772446566734
Batch id 69, Training Loss = 0.29242328362805503
Batch id 70, Training Loss = 0.2916002995531324
Batch id 71, Training Loss = 0.2926197743250264
Batch id 72, Training Loss = 0.29842520699109115
Batch id 73, Training Loss = 0.304090650903212
Batch id 74, Training Loss = 0.30568827271461485
Batch id 75, Training Loss = 0.3049670261772055
Batch id 76, Training Loss = 0.30263812588406847
Batch id 77, Training Loss = 0.30669496074700964
Batch id 78, Training Loss = 0.3036758115019979
Batch id 79, Training Loss = 0.3019797829911112
Batch id 80, Training Loss = 0.3034893894269142
Batch id 81, Training Loss = 0.3025067395916799
Batch id 82, Training Loss = 0.30327124247349885
Batch id 83, Training Loss = 0.30155577386418975
Batch id 84, Training Loss = 0.299633163038422
Batch id 85, Training Loss = 0.2978009160521418
Batch id 86, Training Loss = 0.2968285537656696
Batch id 87, Training Loss = 0.294894310391762
Batch id 88, Training Loss = 0.29276350676343676
Batch id 89, Training Loss = 0.29130016581879714
Batch id 90, Training Loss = 0.2905371121980331
Batch id 91, Training Loss = 0.2880384086266807
Batch id 92, Training Loss = 0.288941350034488
Batch id 93, Training Loss = 0.2882969569652637
Batch id 94, Training Loss = 0.2863909454722152
Batch id 95, Training Loss = 0.2872969110806782
Batch id 96, Training Loss = 0.2868317222779558
Batch id 97, Training Loss = 0.2859028416628739
Batch id 98, Training Loss = 0.2853349868697348
Batch id 99, Training Loss = 0.28345339044928536
Batch id 100, Training Loss = 0.2839033584488499
Batch id 101, Training Loss = 0.28478217168765896
Batch id 102, Training Loss = 0.28619315916473415
Batch id 103, Training Loss = 0.2896878422739413
Batch id 104, Training Loss = 0.28935513340291513
Batch id 105, Training Loss = 0.2895496688642591
Batch id 106, Training Loss = 0.29137837872883976
Batch id 107, Training Loss = 0.29089331475120994
Batch id 108, Training Loss = 0.2924732069630141
Batch id 109, Training Loss = 0.29221933524716975
Batch id 110, Training Loss = 0.29455075594218993
Batch id 111, Training Loss = 0.294244279153645
Batch id 112, Training Loss = 0.2932216224417222
Batch id 113, Training Loss = 0.2916063929074688
Batch id 114, Training Loss = 0.2910704199386679
Batch id 115, Training Loss = 0.2897678530164833
Batch id 116, Training Loss = 0.289296767523146
Batch id 117, Training Loss = 0.28819083965430814
Batch id 118, Training Loss = 0.2880529621068168
Batch id 119, Training Loss = 0.2872145180900891
Batch id 120, Training Loss = 0.28646244995357567
Batch id 121, Training Loss = 0.2862457389958568
Batch id 122, Training Loss = 0.2862974256277083
Batch id 123, Training Loss = 0.2850004249522762
Batch id 124, Training Loss = 0.28432642710208883
Epoch: 6 	Training Loss: 0.284326 	Validation Loss: 1.324183

(IMPLEMENTATION) Test the Model

Try out your model on the test dataset of landmark images. Use the code cell below to calculate and print the test loss and accuracy. Ensure that your test accuracy is greater than 60%.

In [23]:
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)
Test Loss: 0.307481


Test Accuracy: 92% (4622/4996)

Step 3: Write Your Landmark Prediction Algorithm

Great job creating your CNN models! Now that you have put in all the hard work of creating accurate classifiers, let's define some functions to make it easy for others to use your classifiers.

(IMPLEMENTATION) Write Your Algorithm, Part 1

Implement the function predict_landmarks, which accepts a file path to an image and an integer k, and then predicts the top k most likely landmarks. You are required to use your transfer learned CNN from Step 2 to predict the landmarks.

An example of the expected behavior of predict_landmarks:

>>> predicted_landmarks = predict_landmarks('example_image.jpg', 3)
>>> print(predicted_landmarks)
['Golden Gate Bridge', 'Brooklyn Bridge', 'Sydney Harbour Bridge']
In [24]:
import cv2
from PIL import Image

## the class names can be accessed at the `classes` attribute
## of your dataset object (e.g., `train_dataset.classes`)

def predict_landmarks(img_path, k):
    ## TODO: return the names of the top k landmarks predicted by the transfer learned CNN
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
    
    transform = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize
    ])
    
    
    img = Image.open(img_path)
    img = transform(img)
    # Add batch dimension
    img = img.unsqueeze(0)
    if use_cuda:
        img = img.cuda()
        
    model_transfer.eval()
    pred = model_transfer.forward(img)
    _, class_idx = pred.topk(3)
    top_guess=[train_set.classes[i].split('.')[1].replace('_', ' ')  for i in class_idx.tolist()[0]]
    return top_guess
    

# test on a sample image
predict_landmarks('images/test/09.Golden_Gate_Bridge/190f3bae17c32c37.jpg', 5)
Out[24]:
['Golden Gate Bridge', 'Forth Bridge', 'Brooklyn Bridge']

(IMPLEMENTATION) Write Your Algorithm, Part 2

In the code cell below, implement the function suggest_locations, which accepts a file path to an image as input, and then displays the image and the top 3 most likely landmarks as predicted by predict_landmarks.

Some sample output for suggest_locations is provided below, but feel free to design your own user experience!

In [25]:
def visualize_transformed_img(img_path):
    im=Image.open(img_path)
    tr = transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(224)
    ])
    im=tr(im)
    display(im)

def suggest_locations(img_path):
    
    ## TODO: display image and display landmark predictions
    print("Let me analyze the following image:")
    visualize_transformed_img(img_path)
    
    print("Here's my top guesses of where such place belongs:")
    
    # get landmark predictions
    top_guess=predicted_landmarks = predict_landmarks(img_path, 3)
    print(top_guess)
    
    

# test on a sample image
suggest_locations('images/test/09.Golden_Gate_Bridge/190f3bae17c32c37.jpg')
Let me analyze the following image:
Here's my top guesses of where such place belongs:
['Golden Gate Bridge', 'Forth Bridge', 'Brooklyn Bridge']

(IMPLEMENTATION) Test Your Algorithm

Test your algorithm by running the suggest_locations function on at least four images on your computer. Feel free to use any images you like.

Question 4: Is the output better than you expected :) ? Or worse :( ? Provide at least three possible points of improvement for your algorithm.

Answer: The algorithm is running as expected, maybe a little better. With only 6 epochs the model reached 92% which is way higher than the expected benchmark of 60%. As points of improvement:

  • Test transfer learning using more SoTA models like EfficientNet
  • Unfreeze the conv layers and train a little more
  • Experiment with data augmentation techniques like Mixup
In [26]:
## TODO: Execute the `suggest_locations` function on
## at least 4 images on your computer.
## Feel free to use as many code cells as needed.

# Since I don't have touristic places in my computer, I will download them from the internet
!wget https://gooutside-static-cdn.akamaized.net/wp-content/uploads/sites/3/2020/07/quando-reabrir-machu-picchu-reduzira-a-capacidade-para-2-244-visitantes-diarios.jpg -O 'machupicchu.jpg'
--2021-04-04 17:16:08--  https://gooutside-static-cdn.akamaized.net/wp-content/uploads/sites/3/2020/07/quando-reabrir-machu-picchu-reduzira-a-capacidade-para-2-244-visitantes-diarios.jpg
Resolving gooutside-static-cdn.akamaized.net (gooutside-static-cdn.akamaized.net)... 23.35.70.49, 23.35.70.75, 2600:1407:21::17d7:6971, ...
Connecting to gooutside-static-cdn.akamaized.net (gooutside-static-cdn.akamaized.net)|23.35.70.49|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [image/jpeg]
Saving to: ‘machupicchu.jpg’

machupicchu.jpg         [ <=>                ]   1.13M  --.-KB/s    in 0.05s   

2021-04-04 17:16:08 (24.2 MB/s) - ‘machupicchu.jpg’ saved [1189817]

In [27]:
suggest_locations('machupicchu.jpg')
Let me analyze the following image:
Here's my top guesses of where such place belongs:
['Machu Picchu', 'Hanging Temple', 'Eiffel Tower']
In [28]:
!wget https://i.ytimg.com/vi/brZzLyzaXbA/maxresdefault.jpg -O 'badlands.jpg'
--2021-04-04 17:16:09--  https://i.ytimg.com/vi/brZzLyzaXbA/maxresdefault.jpg
Resolving i.ytimg.com (i.ytimg.com)... 173.194.194.119, 2607:f8b0:4001:c10::77
Connecting to i.ytimg.com (i.ytimg.com)|173.194.194.119|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 147180 (144K) [image/jpeg]
Saving to: ‘badlands.jpg’

badlands.jpg        100%[===================>] 143.73K  --.-KB/s    in 0.002s  

2021-04-04 17:16:09 (69.7 MB/s) - ‘badlands.jpg’ saved [147180/147180]

In [29]:
suggest_locations('badlands.jpg')
Let me analyze the following image:
Here's my top guesses of where such place belongs:
['Badlands National Park', 'Grand Canyon', 'Externsteine']
In [30]:
!wget https://greatruns.com/wp-content/uploads/2017/12/Haleakala-National-Park-700x400.jpg -O 'haleakala.jpg'
--2021-04-04 17:16:10--  https://greatruns.com/wp-content/uploads/2017/12/Haleakala-National-Park-700x400.jpg
Resolving greatruns.com (greatruns.com)... 67.225.241.13
Connecting to greatruns.com (greatruns.com)|67.225.241.13|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 79347 (77K) [image/jpeg]
Saving to: ‘haleakala.jpg’

haleakala.jpg       100%[===================>]  77.49K  --.-KB/s    in 0.03s   

2021-04-04 17:16:10 (2.26 MB/s) - ‘haleakala.jpg’ saved [79347/79347]

In [31]:
suggest_locations('haleakala.jpg')
Let me analyze the following image:
Here's my top guesses of where such place belongs:
['Haleakala National Park', 'Machu Picchu', 'Mount Rainier National Park']
In [32]:
!wget https://velvetescape.com/wp-content/uploads/2018/07/IMG_3666-1280x920.jpg -O 'iguazufalls.jpg'
--2021-04-04 17:16:11--  https://velvetescape.com/wp-content/uploads/2018/07/IMG_3666-1280x920.jpg
Resolving velvetescape.com (velvetescape.com)... 172.67.128.137, 104.21.1.49, 2606:4700:3035::6815:131, ...
Connecting to velvetescape.com (velvetescape.com)|172.67.128.137|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 196943 (192K) [image/jpeg]
Saving to: ‘iguazufalls.jpg’

iguazufalls.jpg     100%[===================>] 192.33K  --.-KB/s    in 0.03s   

2021-04-04 17:16:11 (6.08 MB/s) - ‘iguazufalls.jpg’ saved [196943/196943]

In [33]:
suggest_locations('iguazufalls.jpg')
Let me analyze the following image:
Here's my top guesses of where such place belongs:
['Gullfoss Falls', 'Niagara Falls', 'Yellowstone National Park']
In [ ]:
!jupyter nbconvert --execute --to html landmark.ipynb
[NbConvertApp] Converting notebook landmark.ipynb to html
[NbConvertApp] Executing notebook with kernel: python3
In [ ]: